EGERTON UNIVERSITY
COLLEGE OF OPEN AND DISTANCE LEARNING
THE E-CAMPUS
E-LEARNING COURSE
BOTA 413: MOLECULAR AND MICROBIAL GENETICS
By Prof. M.A. OKIROR
mokiror@egerton.ac.ke
+254722280311
June, 2020

BOTA 413: MOLECULAR AND MICROBIAL GENETICS PAGE 1 OF 153

COURSE PRELIMINARIES

Is this course for you?
This course is designed for fourth-year undergraduate students of the Bachelor of Education (Science) and of the Bachelor of Science with a bias in biology. These are students who have a basic understanding of the principles of heredity, to which they were exposed in the General Genetics course. With such grounding, these students can now explore the molecular basis of heredity. Through this course they will understand how a gene expresses the trait it is responsible for: the molecular structure of the gene, and how the message of the gene (as nucleic acid) is used to determine the phenotype. Microbial genetics, on the other hand, demonstrates to the student heredity in microorganisms and the role of microorganisms in revealing molecular biological events, e.g. gene regulation, protein synthesis, etc. You are expected to complete the course in 45 hours within a period of one semester. There are no pre-requisites for you to study this course.

Introduction to the course
This course comprises two parts (sections): molecular genetics, and microbial genetics. In the first part, students are introduced to the fundamentals of molecular genetics: history of molecular genetics, structure and function of the gene, gene regulation, gene mutation, etc. The gene at the molecular level is a length (segment, section, piece, or fragment) of nucleic acid (DNA or RNA). Therefore, in this part, the relationship between DNA (of which a gene is a part) and the phenotype is explained.
In the second part, micro-organisms and their place in molecular genetics breakthroughs are examined and appreciated. Mendel, the father of modern genetics, and his contemporaries used larger organisms (macroorganisms) to develop the principles of heredity. However, as newer investigations were carried out and techniques of inquiry upgraded, it was realised that the organisms of choice for the study of the intrinsic details of the gene were smaller, microscopic ones, largely bacteria and bacteriophages.

Course Content
There are NINE (9) topics in this course, namely:
Topic One: Introduction to Molecular and Microbial Genetics
Topic Two: Genes and Chromosomes
Topic Three: Nucleic acids: As repositories of biological information
Topic Four: Nucleic acids synthesis: The Enzymology of replication and transcription
Topic Five: Translation: The Genetic code concept; Protein synthesis
Topic Six: Mechanisms of gene regulation
Topic Seven: Mutations: Nature and occurrence; DNA repair
Topic Eight: Recombinant DNA technology
Topic Nine: Microbial genetics: genetic exchange; gene mapping

Course Learning Outcomes
Upon successful completion of this course, you should be able to:
i. Explain the evolution of molecular genetics and the place of microorganisms in the advancement and understanding of the molecular basis of heredity.
ii. State the molecular structures of the gene and the chromosome.
iii. Explain the chemical and physical structure of nucleic acids; dispel the early notion that protein was the molecule of heredity.
iv. Demonstrate how nucleic acids make copies of themselves, relating this to chromosome replication.
v. Show the process of gene expression (i.e. transcription and translation).
vi. Demonstrate that whereas organisms have many genes, the expression of these genes is regulated.
vii. Explain how phenotypic variation is brought about at the molecular level and show that gene alterations can be corrected.
viii.
Demonstrate that gene manipulation is possible and can be used to the benefit of organisms and science.
ix. Account for the fundamental place of micro-organisms in the advancement of molecular genetics.

Course Study Skills
As an adult learner your approach to learning will be different from that of your school days: you will choose what you want to study, you will have professional and/or personal motivation for doing so, and you will most likely be fitting your study activities around other professional or domestic responsibilities. Essentially you will be taking control of your learning environment. As a consequence, you will need to consider performance issues related to time management, goal setting, stress management, etc. Perhaps you will also need to reacquaint yourself with areas such as essay planning, coping with exams and using the web as a learning resource. Your most significant considerations will be time and space, that is, the time you dedicate to your learning and the environment in which you engage in that learning. We recommend that you take time now, before starting your self-study, to familiarize yourself with these issues. There are a number of excellent resources on the web. A few suggested links are:

http://www.how-to-study.com/
The "How to study" web site is dedicated to study skills resources. You will find links to study preparation (a list of nine essentials for a good study place), taking notes, strategies for reading text books, using reference sources, and test anxiety.

http://www.ucc.vt.edu/stdysk/stdyhlp.html
This is the web site of the Virginia Tech Division of Student Affairs. You will find links to time scheduling (including a "where does time go?" link), a study skill checklist, basic concentration techniques, control of the study environment, note taking, how to read essays for analysis, and memory skills ("remembering").
http://www.howtostudy.org/resources.php
This is another "How to study" web site with useful links to time management, efficient reading, questioning/listening/observing skills, getting the most out of doing ("hands-on" learning), memory building, tips for staying motivated, and developing a learning plan.

Need Help?
This course was developed in June 2020 by Michael A. Okiror, Phone: +254722280311; Email: mokiror@egerton.ac.ke. Prof. Okiror is a Lecturer of Molecular Genetics in the Department of Biological Sciences at Egerton University. This session, the instructor for this course is Prof. Michael A. Okiror. My office is located in the Department of Biological Sciences Annex (Rm 5), Faculty of Science. You may consult me during normal working hours between Monday and Friday or contact me through: Phone: +254722280311; Email: mokiror@egerton.ac.ke. For technical support, e.g. lost passwords, broken links, etc., please contact tech-support via e-mail at elearning@egerton.ac.ke. You can also reach learner support through elearnersupport@egerton.ac.ke.

Assignments/Activities
Assignments/Activities are provided at the end of each topic. None of the assignments/activities require submission.

Course Learning Requirements
Timely submission of practical schedules (15%)
2 CATs (15%), each at 7.5 marks
Final Examination (70% of total score)
Notebook, calculator, laptop/computer or iPad
Drawing book, pencil and rubber
Commitment
Discipline

Self-assessment
Self-assessments are provided in order to aid your understanding of the topic and course content. While they may not be graded, you are strongly advised to attempt them whenever they are available in a topic.

TOPIC ONE: INTRODUCTION TO MOLECULAR AND MICROBIAL GENETICS

Introduction
Welcome to topic one.
This topic is intended to give you a broad preview of the science of molecular and microbial genetics. You will be guided through the concept of molecular genetics: its definition and its importance. You will also be introduced to the importance of microbial genetics and its centrality in molecular discoveries involving the gene. Thus you will be prepared to learn in detail the two components of this course.

Topic Time
Compulsory online reading, activities, self-assessments and practice exercises [2 hours]
Optional further reading [1.5 hours]
Total student input [3.5 hours]

Topic Learning Requirements
Participation in one to two chats

Learning Outcomes
By the end of this topic you should be able to:
i. Define the terms molecular genetics and microbial genetics
ii. Describe the schools that have contributed to the science of molecular genetics
iii. State the characteristics of genetic material
iv. Explain why protein was considered the molecule of heredity instead of nucleic acids

Topic Content

Preview
The rapid pace of research in genetics, especially after 1900, led to increasing specialization, jargonisation and fragmentation of this branch of biology. Thus, today we have many sub-disciplines of genetics, e.g. molecular, microbial, human, medical, ecological, behavioural genetics, etc.

Definitions:
Molecular genetics is the subdivision of genetics concerned with understanding the structure and function of genes.
Microbial genetics (also called bacterial and phage genetics) is the genetic study of micro-organisms, especially their role and usefulness in investigations on the principles and material basis of heredity and variation.

MOLECULAR GENETICS

1.1 Introduction
Molecular genetics is a recent development in the science of genetics, dating in fact from the mid C20th. It is a branch of genetics concerned with the study of all aspects of the gene, e.g. the identification of the chemical nature of the gene.
It looks at a gene as a unit of biological information. Molecular geneticists aim to understand the way in which biological information is stored in genes and how the information is made available to the living cell.

However, the earliest study of the chemical composition of cell nuclei was made in 1869 by the Swiss physician F. Miescher. His study, based on pus cells, led to the isolation and purification of nuclein, the chemical material of cell nuclei. However, it was not until sometime in the 1940s/1950s that nuclein was demonstrated to contain DNA.

According to Erwin Chargaff (1974), molecular genetics arose with the discovery of the transforming properties of DNA by Avery and others in 1944 and the introduction of phages as objects of biological research. By the latter half of C20th, the molecular structure and function of genes had been determined (ref. Hershey and Chase, Watson and Crick, etc.). Meanwhile J. Monod and F. Jacob showed through their studies how gene expression is controlled. They proposed that certain genes regulate the activity of other genes.

Three intellectual currents (also called schools) have helped shape the advancement of molecular biology. These are:
1. The British school, which was centred at Cambridge with such personalities as F.H. Crick, J.D. Watson, etc. This school was responsible for advancing our understanding of the molecular structure of nucleic acids and proteins.
2. The American school, based at the California Institute of Technology (Caltech), through people like Max Delbruck, A. Hershey, Salvador Luria, S. Benzer, G. Beadle, E. Tatum, J. Lederberg, etc. They are often referred to as the "Phage group". They contributed tremendously to our knowledge of DNA: its chemical nature (through an experiment by Hershey and Chase) and its replication; the finer structure of the gene; and certain aspects of phage genetics.
Later, this group was responsible for the birth of bacterial genetics! Delbruck, Hershey and Luria were awarded the Nobel Prize in Physiology or Medicine in 1969 for their work concerning the replication mechanism and genetic makeup of viruses.
3. The French school, which was based at the Pasteur Institute, Paris, was led by Francois Jacob, and had Elie L. Wollman, Jacques Monod and Arthur B. Pardee. Their contributions lie mainly in the regulation of gene expression and bacterial sexuality. They are particularly remembered for their insightful research designated as the "PaJaMo" experiment. Jacob and Monod won the Nobel Prize in Physiology or Medicine in 1965, sharing it with Andre Lwoff "for their discoveries concerning genetic control of enzyme and virus synthesis".

The birth of molecular biology not only unified the life sciences and gave these sciences a firm foothold in physics and chemistry, but also marked the commencement of a matchless period of growth of genetics, leading to the establishment of a link between genetics and biochemistry. An outstanding product of the "genetics-biochemistry" merger was the cracking of the genetic code in the 1960s. This and the other major advances during the 1950 - 1980 era can be safely attributed to:
a. the introduction of new methods of analysis;
b. the use of electron microscopic techniques; and
c. the application of sophisticated biophysical and biochemical instruments and techniques to the study of micro-organisms and viruses.

In the past three or so decades, genetics has taken a revolutionary and sometimes controversial turn with the development of recombinant DNA technology. By this approach, a revolution has slowly but surely taken place in Science, Agriculture, Medicine and Industry. Recombinant DNA technology encompasses a variety of techniques for isolating, analysing and manipulating individual genes.
Today, it is routine to isolate a segment of DNA, purify it, amplify and sequence it, and use it as may be desired, e.g. introduce it into other organisms where it will be expressed. This technology is making it possible to study in greater detail the molecular structure and function of genes and their regulation, especially in multicellular organisms (plants and animals).

Below is a list of important persons, their studies and the years of those studies that have had an impact on the development of the science of molecular genetics:

Year      Discovery / contribution
1928      Discovery of transformation in bacteria (F. Griffith).
1934      Demonstration that certain phages are made of DNA and protein (M. Schlesinger).
1944      DNA is the hereditary material (O.T. Avery, C.M. MacLeod, and M. McCarty).
1950      Equivalence between A and T and between G and C in DNA (E. Chargaff).
1952      When a phage infects a host cell, the phage DNA enters the host but the protein does not (Hershey / Chase).
1953      The double helix model of DNA structure (J.D. Watson / F.H.C. Crick).
1955      Elucidation of the structure of a gene and definitions of cistron, recon and muton (S. Benzer).
1955      Reconstitution of hybrid TMV by mixing protein and RNA components derived from separate sources (H. Fraenkel-Conrat, R.C. Williams).
1956      Enzymatic synthesis of DNA in vitro (A. Kornberg et al.).
1957      Unravelling the molecular basis of the difference between A and S haemoglobin (V.M. Ingram).
1958      Determination of the distribution of old and new DNA to the progeny of bacterial cells by means of the density-gradient centrifugation technique (M. Meselson and F.W. Stahl).
1961      mRNA and its significance (S. Brenner, F. Gros, F. Jacob, J. Monod).
1961      DNA-RNA hybridization (B.D. Hall, S. Spiegelman).
1961      Universality of the genetic code (G. von Ehrenstein, F. Lipmann).
1961      The codons are triplets of the base pairs of DNA (Crick et al.).
1961      Cracking of the genetic code (M. Nirenberg, J.H. Matthaei).
1961      The operon concept in E. coli (Jacob / Monod).
1964      Colinearity between a gene and its protein product (A.S. Sarabhai et al., Yanofsky et al.).
1964      Mechanism of excision repair in bacterial DNA (R.B. Setlow, etc.).
1965      Chain-terminating codons UAG, UAA (S. Brenner, A.O.W. Stretton, S. Kaplan).
1965-68   Restriction (site-specific) endonucleases in E. coli (W. Arber et al.).
1967      Use of polynucleotides with known repeating sequences in the elucidation and study of the genetic code (H.G. Khorana et al.).
1967      mRNA can originate from both strands of the DNA (K. Taylor, W. Szybalski et al.).
1970      Reverse transcription: RNA-dependent DNA polymerases occur in oncogenic viruses (D. Baltimore, H.M. Temin).
1970-80   The age of plasmids, cloning vehicles, restriction endonucleases, ligases and genetic engineering (several workers).

1.2 THE GENETIC MATERIAL
Until 1944, it was not clear which chemical component of the chromosomes constitutes the genetic material: protein or nucleic acids. Early molecular geneticists were therefore hard pressed to identify the physical and chemical nature of the genetic material. Extensive chemical analyses of chromosomes of different organisms showed chromosomes to contain proteins and nucleic acids (DNA and RNA). Because of their abundance, diversity and complexity within the cell, the role of genetic information carrier was assigned to proteins. This controversy persisted till about mid-C20th, when the first direct experimental evidence identified DNA as the informational basis for the process of heredity. DNA is the genetic material in most micro-organisms and in higher organisms. Later, RNA was found to be the genetic material of some viruses.
(1) Characteristics of the genetic material
In order to function as an informative family of molecules, the chemical substances that make up hereditary determinants must have at least the following attributes:
(a) Expression. The material must be capable of spelling out specific messages, just as the letters of an alphabet make meaningful words. The expression of alternative traits, e.g. round vs wrinkled, is essential in identifying genes through observing the segregation of alleles in matings.
(b) Replication. The material of heredity must be able to replicate itself in order to be passed on to daughter cells. This it does at interphase.
(c) Variation. The genetic material must be able to vary as a result of mutation to give alternative phenotypes.

1.3 Protein as the genetic material
Although proteins and nucleic acids were both considered major candidates for the role of the genetic material, many geneticists, until the 1940s, favoured proteins. This was for the following reasons:
(a) Abundance and diversity. Proteins constitute about 50% of the dry weight of cells and are of a wide variety.
(b) Chemical structure of nucleic acids. It was thought to be simple, consisting only of repeats of the 4 nucleotides A, G, C, and T.
(c) Level of research. Before 1940, most geneticists were engaged in the study of transmission genetics and mutation. The excitement generated in these areas undoubtedly diluted the concern for finding the precise molecule that serves as the genetic material.

Topic Summary
In this topic, you have learned how the science of molecular genetics began and grew to its current status. You have also been shown the central role microorganisms play in the advancement of molecular biology in general and molecular genetics in particular. A brief summary has been presented of the individuals/institutions that have incubated this science.

Further Reading
George W.
Burns: The Science of Genetics (5/e)

TOPIC ACTIVITIES
Attempt these questions as a test of your understanding of the topic.
1. Which schools contributed to the advancement of the science of molecular genetics, and what was the contribution of each?
2. For any material to be regarded as hereditary material, what characteristics should define it?
3. Why were proteins held for long to be the material of heredity?

TOPIC 2. GENES AND CHROMOSOMES

Introduction
Welcome to topic two of this course. As the title indicates, we are going to examine in detail genes and chromosomes, the structures considered central to heredity. We shall give examples of gene and chromosome numbers in some species. Also to be learned is the relationship between DNA content and organismal complexity.

Topic Time
It is estimated that this topic can be fruitfully covered in one lecture hour, with an additional 1.5 hours for supplementary reading and assignments.

Topic Learning Requirements
There are no special learning requirements except your Bota 111 notes as a refresher.

Learning Outcomes
After successful completion of this topic, you should be able to:
i. Give both classical and molecular definitions of the gene
ii. Explain the "fine structure" of the gene
iii. Describe the chemical composition of chromosomes and how this defines them as carriers of hereditary information
iv. Describe one study that illustrates the gene-protein relationship
v. Demonstrate variations in C-value with cyclic changes

Topic Content

2.1 GENES
Definition
In the year one General Genetics course, we defined a gene as a "fundamental unit of inheritance" or a "unit of information about a heritable trait". However, at the molecular level, we wish to define a gene in a corresponding manner, that is, as "a stretch of DNA that specifies the manufacture of a single type of protein (or, for some genes, certain RNAs)".
Thus, a gene contains a set of instructions (information) about a heritable trait. Genes are the working subunits of DNA. The DNA in each chromosome constitutes many genes. Each gene contains a particular set of instructions, usually coding for a particular protein or for a particular function. Mendel first discovered genes in 1865, but called them factors of heredity. They remained so until 1909, when Johannsen called them genes. Their importance was not realized until the start of C20th. For many years the chemical nature of the gene, the molecular structure of DNA and the correspondence between genes and proteins were a mystery. These too were resolved in the 1950s.

Inheritance, distribution and number
Every living thing inherits from its parents a set of genes that determine its development and appearance. Genes are disposed along the chromosomes in a fixed linear order characteristic of the species. The number of genes for any species is not yet known accurately but probably varies from several thousand for simple bacteria to many thousands for a mammal. Prior to the Human Genome Project, a conservative estimate put the number of genes in a human being at between 50,000 and 140,000, distributed along the 22 pairs of autosomes and the sex (X and Y) chromosomes. On that estimate, each chromosome would carry a few thousand genes on average.

2.2 CHROMOSOMES
These are the vehicles of heredity. This is so because they are the cellular location of genes. This was proposed by W. Sutton and T. Boveri in 1902, and confirmed by T.H. Morgan in 1913. Morgan and his students collected a lot of data on Drosophila and localised several genes to specific chromosomes. Chromosomes are largely made of nucleoproteins (i.e. complexes of nucleic acids, DNA or RNA, and histone proteins). The nucleoprotein complex is called chromatin. It is chiefly of two types, heterochromatin and euchromatin.
Prokaryotic cells have no nucleus and their "chromosome" is a single circular molecule of DNA anchored to the cell membrane. The prokaryotic chromosome is a complex of about 20% protein and 80% DNA and forms a compact mass, the nucleoid. The chromosomes of eukaryotic cells, on the other hand, are highly organised structures containing about twice as much protein as DNA.

Chromosomes are characterized on the basis of physical characteristics, e.g. size, centromere position, etc. Two cellular processes, mitosis and meiosis, ensure their distribution to daughter cells. Chromosome number in a cell is either haploid (n) or diploid (2n), or greater (i.e. polyploid). One maternal and one paternal chromosome constitute a homologous pair; the two are identical in size and shape.

2.3 FINE STRUCTURE OF THE GENE
With the acceptance of the chromosome theory of inheritance, genes came to be thought of as beads on a chromosomal string. Mutant alleles of a single gene were considered as beads of different colours, with only one bead of a particular colour on each string. Recombination was believed to involve breaking and re-joining of the string at positions between beads, while recombination within a gene was thought not to occur. The gene was regarded as an indivisible entity, and it was even defined as the basic unit of recombination as well as the unit of function and mutation. Because the genetic systems then available lacked resolving power, this theory was accepted till about the 1940s.

With the introduction of microbial genetic systems by Delbruck and colleagues, and the advancement in experimental techniques, the molecular structure of genes could then be analysed. Seymour Benzer was able to show that the detailed gene structure would not fit the bead theory. He used a genetic system that detected extremely small recombination percentages.
He demonstrated that whereas a gene can be defined as a unit of function (cistron), it can be subdivided into a linear array of sites that are mutable (mutons) and that can be recombined (recons). These later studies sharply defined the basic units of mutation, recombination and genetic function. They destroyed the indivisible bead theory. In its place came the concept that a gene is a sequence of nucleotide pairs.

2.4 GENE-PROTEIN RELATIONSHIP
The most fruitful endeavours to find a correspondence between genes and proteins examined the ways in which changes in the gene affected the kind and function of proteins in the cell. Among the pioneers were:

(1) Archibald E. Garrod. Between 1898 and 1902, he examined the human condition alkaptonuria. This arises from the inability of the individual to completely metabolise the amino acid tyrosine in the diet, so that an intermediate product called homogentisic acid accumulates instead. By carrying out a pedigree analysis and genetic studies on this trait, Garrod concluded that this was a heritable abnormality arising out of a mutation in a single gene. A defective gene led to the production of a non-functional enzyme, homogentisic acid oxidase! Thus, by the start of C20th Garrod had demonstrated a relationship between a gene and a protein (or enzyme). Unfortunately, Garrod's notions, like those of Mendel, seem to have been so far ahead of their time that they had little influence in the market place of genetic ideas until their rediscovery 30 years later!

However, in between Garrod's study and that of Beadle and Tatum (i.e. 1920 to 1940), three inconclusive studies were made to establish a relationship between genes and enzymes. These were on plant pigments, eye colour in butterflies, and eye colour in Drosophila.

(2) George W. Beadle and Edward L. Tatum. In 1941, they carried out a classical study and showed that genes controlled enzyme synthesis.
They used the fungus Neurospora, which can be grown in a simple, synthetic medium provided biotin (a vitamin) is included. Varying the medium allowed them to isolate several mutants (1 in 200 spores) that could not grow in the minimal medium but could grow in the complete (enriched) medium.

Inference
The unusual fungi are the descendants of mutant spores in whose hereditary material a gene mutation has blocked an essential metabolic pathway.

Conclusion
On the basis of such observations, they promulgated the "one gene - one enzyme" hypothesis. This hypothesis asserts that "each gene has only one primary function, to direct the formation of one and only one enzyme, and thus to control the single chemical reaction catalysed by that one enzyme". Since then, it has been repeatedly demonstrated that a gene controls the production of one polypeptide, and therefore this hypothesis has since been renamed "one gene - one polypeptide". In 1958, the two scientists, together with Lederberg, shared the Nobel Prize in Physiology or Medicine for their demonstration that a mutation of a gene would produce a corresponding loss of an enzyme and an alteration of a cell's metabolism.

2.5 DNA AND MORPHOLOGICAL COMPLEXITY
A genome is an organism's complete set of DNA, including all the genes. Each cell has a genome! Genome size does not always correlate with the complexity of the organism and, in fact, shows great variation in size and gene number. Genome size is usually measured in base pairs (bp), or in bases for single-stranded DNA or RNA. The amount of DNA in the nucleus varies between plants (organisms), both within and between species. Because the amount of DNA also changes according to the cell cycle, it is common to compare values during the G1 phase, before replication has occurred. This amount is known as the 2C value, meaning the content of DNA in a somatic nucleus.
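To make the base-pair measure concrete, a genome size in bp can be converted to an approximate length and mass of DNA. The sketch below is only illustrative: it assumes about 0.34 nm per base pair (B-form DNA) and about 0.978 x 10^9 bp per picogram of double-stranded DNA, neither of which is stated in this course text, and the function names and the example genome size are hypothetical.

```python
# Illustrative conversions for a genome size given in base pairs (bp).
# Assumptions (not from the course text): B-form DNA rises about 0.34 nm
# per base pair, and 1 picogram of double-stranded DNA is about 0.978e9 bp.
NM_PER_BP = 0.34
BP_PER_PG = 0.978e9


def contour_length_mm(base_pairs: float) -> float:
    """Approximate end-to-end length of the DNA in millimetres."""
    return base_pairs * NM_PER_BP * 1e-6  # 1 nm = 1e-6 mm


def mass_pg(base_pairs: float) -> float:
    """Approximate mass of the DNA in picograms."""
    return base_pairs / BP_PER_PG


if __name__ == "__main__":
    genome_bp = 1.65e8  # hypothetical input: a genome of 165 million bp
    print(f"length: {contour_length_mm(genome_bp):.1f} mm")
    print(f"mass:   {mass_pg(genome_bp):.3f} pg")
```

Under these assumptions the same two functions can be applied to any haploid genome size, which is one way to move between the bp and picogram (C-value) units used in this topic.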
Having established the role and function of DNA, speculation logically set in that the amount of DNA in the nucleus (genome) was related to the organism's complexity (Table). For instance, yeasts, Drosophila, chickens and humans have successively larger amounts of DNA in their haploid chromosome sets (0.05, 0.15, 0.3, and 3.2 picograms, respectively), in line with the increasing complexity of these organisms.

The DNA content of various cells

Organism                        Size of genome              Maximum number of
                                bp (a)       length (mm)    proteins encoded (b)   chroms. (n)
1. Prokaryotes:
   E. coli                      4.5 x 10^6   1.36           3.3 x 10^3             1
2. Eukaryotes:
   S. cerevisiae (yeast)        7 x 10^7     4.60           1.125 x 10^4           17
   D. melanogaster              1.65 x 10^8  56.00          1.375 x 10^6           4
3. Homo sapiens:
   Man                          3 x 10^9     990            2.42 x 10^6            23

a = bp = base pairs; 1 Kb = 1000 nucleotides
b = assuming 1,200 bp/protein

This apparently was the belief of many scientists. Later, experiments showed that such a relationship was not consistent. The vertebrate species with the greatest amount of DNA per cell are amphibians/reptiles (e.g. toad = 4.2; lizard = 3.5, compared with 3.0 for man), which are surely less complex than humans in both structure and behaviour.

Besides, there is also considerable intra-group variation in DNA content. For example, all insects or all amphibians would appear to be similarly complex, but the amount of haploid DNA within each of these groups varies by a factor of 100! Also, similar plants can have strikingly different karyotypes. The broad bean, Vicia faba (2n = 12), has about half the chromosome number of the kidney bean, Phaseolus vulgaris (2n = 22). However, it has about 3-4 times as much DNA per cell as the kidney bean. Findings such as the above have given rise to the "C-value paradox" (i.e., the failure of C-value to correspond to phylogenetic complexity).
The reason is not clear. From the above facts it would seem that some of the DNA in certain organisms is "extra" or expendable; apparently not all DNA in all organisms is for encoding proteins. For instance, a third of the human genome is redundant. Of the balance, there are regions of control and introns that further reduce the effective quantity. In bacteria, however, the entire DNA is used to code information or control its expression.

2.6 THE C-VALUE CONCEPT
The C-value is another measure of genome size. The C-value refers to the amount, in picograms, of DNA contained within a haploid nucleus (e.g. a gamete), or one half the amount in a diploid somatic cell of a eukaryotic organism. Since DNA and proteins form the chromatin of which chromosomes are made, any change in chromosome number in the cell affects the amount of DNA in that cell (Fig. 1).

If C represents the amount of DNA in a haploid cell (gamete) before fertilization, then a gamete will have an amount of 1C and a zygote 2C. During the S period of pre-meiotic mitosis, the DNA content in the cell will rise to 4C. Mitotic anaphase will bring it back down to 2C. During the S period immediately preceding meiosis, it will rise back to 4C. Anaphase 1 will reduce the DNA content to 2C and anaphase 2 to the original 1C value.

[Figure: amount of DNA per cell (C-value, 1C to 4C) plotted against the sequence of stages: gamete, fertilization, zygote, mitotic divisions (S, A), meiotic divisions (S, A1, A2), gamete.]
Fig. 1. Changes in the amount of DNA in a cell of a plant or animal, which undergoes mitosis then meiosis.

Topic Summary
In this topic we have examined the gene and the chromosome in molecular terms and defined their core functions in heredity. At the molecular level Benzer showed a gene to be a sequence of nucleotide pairs, each of which can undergo mutation and/or recombination.
Through the studies of scientists such as Garrod, and Beadle and Tatum, we have learnt that there is an intricate relationship between genes and proteins. As to whether there is a link between cellular DNA content and organismal complexity, this has been shown to hold broadly but not universally. Finally, we have considered the changes that can occur to the DNA content in mitotic cells and in those that produce gametes.

Further Reading

(1) Suzuki, D.T., Griffiths, A.J.F., Miller, J.H. and Lewontin, R.C. (1986). An Introduction to Genetic Analysis. (3/e). W.H. Freeman and Co., NY.

TOPIC ACTIVITIES

Attempt these questions to evaluate your understanding of the topic.
1. Explain what these terms mean: gene, muton, cistron, recon, and C-value concept.
2. What do you understand by "C-value paradox"?
3. Describe one study to show a relationship between a gene and a protein.
4. Explain the changes that can occur to DNA content in mitotic and meiotic cells.

TOPIC 3. NUCLEIC ACIDS (DNA AND RNA) AND THEIR ROLES AS REPOSITORIES OF GENETIC INFORMATION

Introduction

Welcome to the third topic of this course. In this topic it is demonstrated that nucleic acids (DNA and RNA), and not proteins, are the molecules of heredity. This is followed by an examination of the chemical and physical structures of DNA and an explanation of how both nucleic acids serve as repositories of biological information.

Topic Time

It is envisaged that three lecture hours are adequate for the coverage of this topic. The student is expected to devote at least two more hours to self-revision and going through the assignment.

Topic Learning Requirements

No specific requirements are proposed for this topic, but an accumulated understanding of what has been covered so far is an advantage.

Learning Outcomes

After successfully completing this topic, you should be able to:
i. Describe those experiments that led to the acceptance of DNA and RNA as the molecules of heredity
ii. Describe both the chemical and physical structures of DNA and RNA

Topic Content

3.1 INTRODUCTION

In General Genetics, we learnt how Mendel and his successors used data from breeding experiments to deduce the presence and activity of genes, analyse the effect of alternative alleles on phenotype, etc. With this approach, it is possible to predict the outcomes of genetic crosses in the absence of detailed knowledge of the molecules and biochemical reactions that underlie the events observed. However, it is impossible to decipher the exact mechanisms by which genes determine phenotypes, transmit instructions between generations, and evolve new information without understanding what they are made of. Hence the necessity to examine DNA, the molecule that is the genetic material.

3.2 CHEMICAL AND PHYSICAL STRUCTURE OF DNA

1. Chemical structure of DNA

For a long time, scientists did not imagine the structure of DNA to be complex, given that it was thought to be merely repeats of the 4 (tetra) nucleotides. However, with the application of novel and more advanced techniques, Erwin Chargaff, a biochemist, was able (1949-53) to measure the amount of DNA in tissues and cells and determine the relative proportions of the bases present in the DNA samples of several species. Through his results, Chargaff convinced the scientific world that DNA had the chemical complexity necessary for a genetic material, and he formulated the following rules:
(1) the amount of DNA extracted from different tissues of the same organism is the same;
(2) the base composition of a given DNA molecule is constant, with A = T and G = C. For instance, in humans, A = 30%; T = 29.4%; G = 19.9%; C = 19.8%;
(3) DNA contains equal amounts of purines and pyrimidines, i.e. the ratio (A + G) : (C + T) is, as a rule, close to 1.00 regardless of the organism used as the DNA source.

Chargaff's findings, as will be seen, played an important part in the discovery of the double helix structure of DNA.

DNA, like RNA, is a polymeric molecule made up of monomeric units called nucleotides. A nucleotide has three basic components: a pentose (5-carbon) sugar (deoxyribose in DNA); a phosphate group; and a nitrogenous (nitrogen-containing) base.
(1) Nitrogenous bases: these are of two kinds. The purines (two) are 9-membered and double-ringed: Adenine and Guanine. The pyrimidines (three) are 6-membered and single-ringed: Cytosine, Thymine and Uracil. These bases are commonly represented by their first letters: A, G, C, T, and U.
(2) Pentose sugar: this 5C-sugar gives the nucleic acid its name, deoxyribose for DNA and ribose for RNA. The former lacks the oxygen of the hydroxyl (-OH) group on C2'.
(3) A phosphate group: in a nucleic acid polymer, this group joins two nucleosides to each other by forming a phosphodiester bridge between the C5' of one sugar and the C3' of another.

Note: A combination of a base and a sugar is a nucleoside, while a nucleoside plus a phosphate group is a nucleotide. A mononucleotide can be described as a nucleoside monophosphate. When two or three phosphate groups are on a nucleoside we have a nucleoside diphosphate and a nucleoside triphosphate respectively. The triphosphate form serves as the precursor molecule during nucleic acid synthesis. Also, triphosphates like ATP and GTP are important in the cell's bioenergetics.
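Chargaff's ratios can be checked with a few lines of arithmetic. The sketch below is a minimal illustration that uses the human base percentages quoted above; the function name and the dictionary layout are assumptions made for the example, not part of any standard library.

```python
# Minimal sketch: compute Chargaff's ratios from a base composition.
# The human percentages are those quoted in the text; the function name
# and data layout are illustrative assumptions.


def chargaff_ratios(percent: dict) -> dict:
    """Return the A/T, G/C and purine/pyrimidine ratios."""
    return {
        "A/T": percent["A"] / percent["T"],
        "G/C": percent["G"] / percent["C"],
        "purine/pyrimidine": (percent["A"] + percent["G"])
        / (percent["C"] + percent["T"]),
    }


human = {"A": 30.0, "T": 29.4, "G": 19.9, "C": 19.8}

if __name__ == "__main__":
    for name, value in chargaff_ratios(human).items():
        print(f"{name}: {value:.2f}")
```

All three ratios come out close to 1, which is the regularity that Chargaff's rules express.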
Nucleosides and nucleotides are named according to the specific nitrogenous base that is part of the building block, viz:

(a) RNA
Ribonucleosides          Ribonucleotides
Adenosine (A)            Adenylic acid (AMP)
Cytidine (C)             Cytidylic acid (CMP)
Guanosine (G)            Guanylic acid (GMP)
Uridine (U)              Uridylic acid (UMP)

(b) DNA
Deoxyribonucleosides     Deoxyribonucleotides
Deoxyadenosine (dA)      Deoxyadenylic acid (dAMP)
Deoxycytidine (dC)       Deoxycytidylic acid (dCMP)
Deoxyguanosine (dG)      Deoxyguanylic acid (dGMP)
Deoxythymidine (dT)      Deoxythymidylic acid (dTMP)

2. Physical structure of DNA

Maurice Wilkins and Rosalind Franklin, in the early 1950s, used the X-ray crystallography technique to try to determine the physical structure of the DNA molecule by capturing diffraction images on a photographic plate. By analysing the photographs, they obtained information about the molecule's atomic structure. They noticed a few repeating distances in the molecule: 0.34 nm (3.4 Å), 2 nm (20 Å) and 3.4 nm (34 Å). Franklin also saw a pattern indicating that the molecule was helical (corkscrew-shaped). She also argued that there were probably two strands in each molecule, not one or three.

3. The DNA model (according to Watson and Crick)

The final model was built on the information of Chargaff, Wilkins and Franklin. In April 1953, Watson and Crick published their findings in the journal Nature: a double-helical structure of DNA held together by cross-connections of paired bases, and resembling a spiral staircase in which the base-to-base attachments represent the series of steps. The two chains are coiled around a common axis called the axis of symmetry. It is important to note that whereas most DNA is double stranded, a few viruses contain single-stranded DNA!
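Because the bases pair specifically (A with T, G with C), the sequence of one strand fixes the sequence of its partner. The sketch below is a minimal illustration of that idea. It assumes sequences are written in the 5' to 3' direction and contain only A, T, G and C; the function name and the example sequence are hypothetical.

```python
# Minimal sketch: derive the partner strand of a DNA duplex from one strand.
# Assumptions: sequences are written 5' to 3' and contain only A, T, G, C;
# the function name is illustrative.

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    complement = [PAIRS[base] for base in strand_5_to_3.upper()]
    # The partner strand runs in the opposite direction, so reverse it to
    # present it in the conventional 5' to 3' orientation.
    return "".join(reversed(complement))


if __name__ == "__main__":
    print(partner_strand("ATGGCA"))
```

Applying the function twice returns the starting sequence, which reflects the fact that each strand is a full specification of the other.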
The measurements:
(i) a residue on each chain every 3.4 Å;
(ii) an angle of 36° between adjacent residues in the same chain, so that the structure repeats after ten residues on each chain, i.e. after 34 Å;
(iii) the distance of a phosphorus atom from the axis is 10 Å.

The backbone of the helix is a chain of alternating sugar (S) and phosphate (P) groups. Each of the two chains (strands) has a 5' to 3' polarity; the strands run in opposite directions (anti-parallel) and are complementary (but not identical). It is worth noting that other scientists had earlier proposed other models of DNA, e.g. Pauling and Corey, and Fraser, all of whom suggested a three-chain intertwined structure. A typical eukaryotic chromosome consists of a DNA molecule many millions of nucleotides long. The length of a DNA molecule is usually given in base pairs or kilobase pairs (bps or Kbps).

3.3 DNA AS THE GENETIC MATERIAL

By 1958, Crick had presented the general rules for the unidirectional flow of genetic information from DNA to the creation of proteins, as follows:

DNA --(transcription)--> RNA --(translation)--> Enzyme (protein)

This information transfer did not occur from protein to protein, or "backward" from protein to RNA or to DNA. This one-directional transfer hypothesis became known as the "Central Dogma of Molecular Biology". Later, however, some scientists questioned the validity of the central dogma, especially with regard to the RNA viruses.

3.4 DNA AS THE REPOSITORY OF GENETIC INFORMATION

The genetic information is encoded in the sequence of the bases along the strands of the helix and represents the instructions for making proteins and the various types of RNA. In the nucleus, DNA replication and/or transcription occurs; special enzymes are involved. In the cytoplasm, protein synthesis occurs.

3.5 PROOFS OF DNA AS THE GENETIC MATERIAL

1. Chromosome composition: Eukaryotic chromosomes consist of one long, linear molecule of DNA bound to a complex of proteins to form chromatin.

2. Experimental investigations: (a) first by F. Griffith (1928) through the transformation experiment; (b) secondly by Avery, MacLeod and McCarty (1944); and (c) thirdly by Hershey and Chase (1952) through the "Great Kitchen Blender" experiment. All of these pointed to DNA as the genetic material. The first two showed that bacterial cells expressing one phenotype can be transformed into cells with a different phenotype, and that the transforming agent is DNA.

(a) The transformation experiment

Griffith (1927/8) demonstrated the process of transformation. His research provided the foundation for the work of Avery et al.

Griffith's experiment:

Injected into mouse                                   Result                Bacteria recovered
Heat-killed, virulent S-type cells                    Healthy mouse         No bacteria recovered
Heat-killed S-type cells + living R-type cells        Death by pneumonia    Living, virulent S-type bacteria present
Living, non-virulent R-type cells                     Healthy mouse         No bacteria recovered

His results set the stage for the elucidation of the chemical nature of the "transforming principle".

(b) Identification of the transforming principle

Oswald T. Avery, C.M. MacLeod and M. McCarty identified the transforming principle. They published their work in a classical paper in 1944, after 10 years of research. S cells were killed and fractionated into polysaccharides, lipids, RNA, protein and DNA, and each fraction was mixed with live R cells:

Fraction added to live R cells     Colonies obtained
Polysaccharides                    R
Lipids                             R
RNA                                R
Protein                            R
DNA                                S

They found that DNA is the only agent that produces smooth (S) colonies when mixed with live R cells! Despite this evidence, many within the scientific community still resisted the idea that DNA is the molecule of heredity. They argued that perhaps Avery's results reflected the activity of contaminants. Unconvinced for the moment, these scientists remained attached to the idea that proteins were the prime candidates for the genetic material. This discovery did, however, have the impact of initiating studies in molecular genetics with the use of microorganisms.
(c) The Hershey-Chase bacteriophage experiment

In their now-legendary experiments, Alfred D. Hershey and Martha C. Chase studied bacteriophage. The phages they used were simple particles composed of protein and DNA, with the outer structures made of protein and the inner core consisting of DNA. Hershey and Chase knew that the phages attached to the surface of a host bacterial cell and injected some substance (either DNA or protein) into the host. This substance gave "instructions" that caused the host bacterium to start making large numbers of phages; in other words, it was the phage's genetic material. Before the experiment, Hershey thought that the genetic material would prove to be protein.

To establish whether the phage injected DNA or protein into host bacteria, Hershey and Chase prepared two different batches of phage. In each batch, the phages were produced in the presence of a specific radioactive element, which was incorporated into the macromolecules (DNA and protein) that made up the phage. One sample was produced in the presence of 35S, a radioactive isotope of sulfur. Sulfur is found in many proteins and is absent from DNA, so only phage proteins were radioactively labeled by this treatment. The other sample was produced in the presence of 32P, a radioactive isotope of phosphorus. Phosphorus is found in DNA and not in proteins, so only phage DNA (and not phage proteins) was radioactively labeled by this treatment.

Each batch of phage was used to infect a different culture of bacteria. After infection had taken place, each culture was whirled in a blender, removing any remaining phage and phage parts from the outside of the bacterial cells. Finally, the cultures were centrifuged, or spun at high speeds, to separate the bacteria from the phage debris. Centrifugation causes heavier material, such as bacteria, to move to the bottom of the tube and form a lump called a pellet.
Lighter material, such as the medium (broth) used to grow the cultures, along with phage and phage parts, remains near the top of the tube and forms a liquid layer called the supernatant. Radioactivity was measured in the pellet and in the liquid (supernatant) for each experiment. When Hershey and Chase measured radioactivity in the pellet and supernatant from both of their experiments, they found that a large amount of 32P appeared in the pellet (inside the bacteria), whereas almost all of the 35S appeared in the supernatant (outside the bacteria).

Conclusion

Based on this experiment, Hershey and Chase concluded that DNA, not protein, was injected into host cells and made up the genetic material of the phage.

3.6 RNA AS THE GENETIC MATERIAL

Some viruses that infect plant cells, and some that infect animal cells, contain RNA only. In these viruses, RNA acts as the genetic material. One plant virus, tobacco mosaic virus (TMV), which contains RNA and not DNA, was an important tool for genetic experiments. TMV infects tobacco, causing the infected regions on leaves to become discolored and blistered. Different strains of TMV produce clearly different inherited lesions on the infected leaves. The common virus produces a green mosaic disease, but a variant, Holmes ribgrass (TMV-HR), produces ring spot lesions. Moreover, the amino acid compositions of the proteins of these two strains differ. The demonstration that RNA is the genetic material came in several studies:

1. In 1956, A. Gierer and Gerhard Schramm exposed tobacco plant tissue to purified RNA from tobacco mosaic virus (TMV), and the plants developed the same types of lesions as if they had been exposed to the virus itself. What would be the results if the RNA were treated with DNase, RNase, or protease prior to its exposure to the plant tissue? It was concluded that RNA is the genetic material of this virus.

2. In 1957, Heinz Fraenkel-Conrat and B. Singer reported another type of experiment with TMV. They used two tobacco viruses, TMV common and its variant, the HR virus. From the two strains they were able to reconstitute viruses with the RNA from TMV common enclosed in TMV-HR protein, and TMV-HR RNA enclosed in TMV common protein. When these reassembled viruses were used to infect tobacco leaves, the progeny viruses produced were always found to be phenotypically and genotypically identical to the parent strain from which the RNA had been obtained. The reassembled viruses with TMV common RNA and TMV-HR protein produced the green mosaic disease characteristic of TMV common, and the recovered virus had the protein characteristic of TMV common. This proved that the specificity of the virus proteins was determined by RNA alone and that the proteins carried no genetic information. Hence RNA, not protein, carries the genetic information. The genetic RNA is usually single stranded, but in some viruses it is double stranded, as in reovirus and wound tumor virus.

3. In 1965 and 1966, Norman R. Pace and Sol Spiegelman further demonstrated that RNA from the phage Q-Beta could be isolated and replicated in vitro. Replication was dependent on an enzyme, Q-Beta RNA replicase, which was isolated from host E. coli cells following normal infection. When the RNA replicated in vitro was added to E. coli protoplasts, infection and viral multiplication occurred. Thus, RNA synthesised in a test tube can amply serve as the genetic material of the phage.

Topic Summary

In this topic, you have been introduced to the two kinds of nucleic acids, DNA and RNA, the former having been isolated from pus cells of humans and the latter from yeast cells. We have examined several aspects of these acids: their physical and chemical structures; how the physical model was arrived at by Watson and Crick; and the proofs that they are indeed the materials of heredity.
Further Reading

(1) Burns, G.W. (1983). The Science of Genetics: An Introduction to Heredity. (5/e). Macmillan Pub. Co., NY.
(2) Russell, P.J. (1992). Genetics. (3/e). HarperCollins Publishers, NY.

TOPIC ACTIVITIES

Attempt the questions below as you consolidate your understanding of this topic.
1. What do you understand by "Chargaff's rule"?
2. What is a nucleotide? How does it differ from a nucleoside?
3. What is the significance of these values (3.4 Å, 34 Å, 10 Å and 36°) in the DNA physical model?
4. Outline one proof each that DNA and RNA are the genetic material.

TOPIC 4. NUCLEIC ACIDS SYNTHESIS: REPLICATION AND TRANSCRIPTION

Introduction

In Topic One, we learned that a molecule that is hereditary must be able to make copies of itself and also be able to express the information it carries. Following the confirmation of DNA as the hereditary molecule, it is now necessary to demonstrate that replication and transcription can indeed take place.

Topic Time

Consideration of the two fundamental activities, replication and transcription, requires progressive exploration. They can be successfully covered in five lecture hours, with the student setting aside additional time for self-study and doing assignments.

Topic Learning Requirements

Notes of BOTA 111 for an overview of this topic.

Learning Outcomes

After successfully completing this topic, you should be able to:
i. List the modes earlier proposed for DNA replication
ii. Describe how the correct mode was identified and confirmed
iii. Describe the process of replication of double-stranded DNA as well as single-stranded DNA
iv. Define and demonstrate transcription in prokaryotes and eukaryotes
v. Define split genes and the need for RNA processing to produce mature transcripts

Topic Content

Introduction

After proposing the double-helical DNA structure, Watson and Crick remarked that the specific pairing they had postulated immediately suggested a possible copying mechanism for the genetic material. Their prediction was indeed true. Any molecule that is to serve as a repository of genetic information must be able to act as a template for the synthesis of an exact copy of itself to pass on to daughter cells at cell division. Amongst biological molecules, only the nucleic acids, DNA and RNA, can do this.

Summary of guides to nucleic acids synthesis

(1) Both DNA and RNA chains are produced in cells by the copying of a pre-existing DNA strand according to the Watson and Crick rules of base pairing. In the replication of a duplex, both strands are copied. In some viruses, the copying of a pre-existing RNA molecule produces RNA molecules; in the retroviruses, the copying of RNA produces DNA.

(2) Nucleic acid strand growth is in one direction, 5' to 3'. All RNA and DNA synthesis, both cellular and viral, proceeds in one chemical direction. This directionality has given rise to the convention that polynucleotide sequences are read from left to right in the 5' to 3' direction. The nucleotides that are used in the construction of nucleic acid chains are the 5'-triphosphates of ribo- or deoxyribonucleosides.

(3) Special enzymes called polymerases elongate RNA or DNA strands. These are the enzymes that make the phosphodiester bonds, and their activity results in the production of a polymer of DNA or RNA, viz: (a) the enzymes that make more DNA from DNA are called DNA polymerases (Dpols); (b) those that copy RNA from DNA are called RNA polymerases (Rpols).

(4) RNA polymerases can initiate a nucleic acid strand, but DNA polymerases cannot.
A single RNA polymerase: (a) can find an appropriate initiation site on duplex DNA, (b) bind the DNA, (c) separate the two strands in that region, and (d) begin generating a new RNA strand. As for DNA polymerases: (a) they cannot initiate synthesis of a DNA chain; they can only elongate a pre-existing primer strand of DNA or RNA; (b) all DNA polymerases catalyse nucleotide addition to the 3'-OH end of the primer and thus direct growth in the 5' to 3' direction.

The terminal 5' end of an RNA strand is chemically distinct from the rest of the strand. Unlike the nucleotides within the strand, the nucleotide at the 5' end retains all three phosphate groups of the triphosphate. When each additional nucleotide is added to the 3' end of the growing strand, only the α-phosphate is retained, whereas the β- and γ-phosphates are lost. This pyrophosphate group is further cleaved to yield inorganic phosphate (Pi), with release of energy.

4.1 Replication of DNA

Using the ultraviolet microscope, it was found that cells during interphase undergo cyclic changes in their nucleic acid content. The UV microscope revealed that just before the onset of mitosis, the amount of DNA is doubled. Furthermore, it was found that the total amount of DNA in the prophase and metaphase chromosomes is twice the amount found ordinarily. These findings indicated that during interphase (prior to mitosis) not only do the chromosomes replicate, but their genetic material, the DNA, does the same.

4.1.1 Modes of DNA replication

Three mechanisms (modes) of DNA replication were envisaged. These are: semi-conservative (according to Watson and Crick); conservative; and dispersive (according to Delbruck and Stent). Delbruck and Stent thought that the semi-conservative model was too simple.

(1) Semi-conservative
A daughter duplex contains one parental and one newly synthesised strand.
(2) Conservative
According to this model, the two original strands remain together intact, and the two strands of the daughter molecule are synthesised entirely anew from available nucleotides. Thus the parental duplex is conserved.

(3) Dispersive
According to Delbruck and Stent, nothing is conserved, i.e. the parental DNA molecules are broken up and new DNA molecules are synthesised. Thus, the daughter duplexes consist of strands containing segments of both parental DNA and newly synthesised DNA.

Five years after the physical structure of DNA had been given, Matthew Meselson and Franklin Stahl set about testing which of the three modes of replication was in agreement with the Watson-Crick model of DNA. They confirmed the semi-conservative mode.

4.1.2 The experiment

(1) Meselson and Stahl used a heavy isotope of nitrogen, 15N, to label the DNA of E. coli by growing cells in a medium in which the only nitrogen source was 15NH4Cl. A dozen generations of growth in such a medium is sufficient to label the entire bacterial DNA uniformly with 15N. DNA molecules containing 15N can be distinguished from DNA molecules containing the lighter, common isotope, 14N, on the basis of their densities, because DNA containing 15N has a higher mass per nucleotide than DNA containing 14N. DNA molecules of different densities can be separated from one another by centrifugation in a density gradient of a caesium chloride (CsCl) solution.
(2) The bacterial cells from the 15N medium were then transferred to 14N culture medium.
(3) Samples of these bacteria were taken immediately, and then after 1, 2, 3 and 4 generations in the new medium, and the DNA extracted.
(4) These DNA extracts were then centrifuged in the CsCl gradient and the bands of DNA of different densities located by noting their absorption of UV light (see figure below).

Inference

The formation of the hybrid 15N/14N DNA molecules after one generation, and their perpetuation, are the key to the mode of DNA replication.

Conclusion

On the basis of their observations, Meselson and Stahl concluded that the DNA of intermediate density, formed after one generation of growth in 14N, is a hybrid molecule consisting of one fully heavy, conserved parental strand and one fully light, newly synthesised daughter strand, exactly as predicted by the Watson-Crick model.

4.1.3 Single-Stranded DNA (ssDNA) Replication

Certain small bacteriophages contain a circular, single-stranded DNA (ssDNA) as genome, e.g. the Escherichia coli phages φX174, M13, f1, and S13. Their replication can be divided into three steps: (1) conversion of the ssDNA genome to a double-stranded form, called the replicative form (RF); (2) multiplication of RF DNA by rolling circle replication; and (3) generation of ssDNA genomes from the RF DNA for packaging into phage. Thus, replication of the ssDNA genome depends on the synthesis by DNA polymerase of a complementary strand! This replication further confirms the semi-conservative mode of DNA replication.

4.1.4 Enzymology of replication

The enzymes involved in the DNA replication process are template-directed polymerases and others.

(1) DNA topoisomerases, viz:
Type I: reversibly nicks one strand of the DNA helix, allowing the helix to swivel. These topoisomerases have both nuclease (strand-cutting) and ligase (strand-resealing) activities.
Type II (gyrase): required to rotate the double helix and thereby relax it.
(2) Helicase: unwinds short segments of the DNA double helix, aided by the single-strand DNA-binding proteins (SSBs). In so doing, it breaks the base pairing between the two strands of the parent DNA molecule. To do this, it binds to single-stranded DNA near the replication fork and then moves into the fork, forcing the strands apart, i.e. unwinding the double helix. When the strands separate, SSB proteins bind, preventing re-formation of the double helix.

(3) Primase: a specific RNA polymerase; it synthesises short stretches of RNA (6-30 nucleotides long). Both the leading and lagging strands need RNA primers when replication commences: one for the leading strand and several for the lagging strand.

(4) Polymerases: responsible for copying the DNA templates. They read the templates only in the 3' to 5' direction, and therefore synthesise the new DNA strands in the 5' to 3' (anti-parallel) direction, one towards the replication fork and the other away from it. Prokaryotes have three (I, II, and III), and eukaryotes four (α, β, γ, and δ).

(5) Ligase: makes the final phosphodiester linkage between the 5'-phosphate group on the DNA chain made by DNA polymerase III (DP III) and the 3'-OH group on the chain made by DNA polymerase I (DP I). The joining of these two stretches of DNA requires energy, which in humans is provided by the cleavage of ATP to AMP and PPi.

4.1.5 The Process

1. Origin of replication

Replication always starts at a special site called the replication origin. A eukaryotic DNA molecule has several origins of replication; prokaryotic DNA has only one. Having multiple origins of replication provides a mechanism for rapidly replicating the great length of eukaryotic DNA molecules. A DNA that forms a complete replicating unit is often termed a replicon. In bacteria, the chromosome functions as a single replicon, whereas eukaryotic chromosomes contain hundreds of replicons in series.

2. Procedure

DNA synthesis is bi-directional from a single point of origin (in prokaryotes with circular DNA), i.e. the replication forks move in both directions away from the origin. The enzyme gyrase (topoisomerase II) transiently breaks the DNA and thus causes DNA relaxation. Each replicon contains a segment to which a specific RNA polymerase binds and a replicator locus at which DNA replication commences. DNA replication in a eukaryotic chromosome proceeds at a rate of about 50 nucleotides/sec, while in bacteria it is about 500 nucleotides/sec.

The two template strands are antiparallel, yet both new strands grow in a 5' to 3' direction. How is this possible? It is accomplished by the discontinuous synthesis of one of the daughter strands (the lagging strand) and the continuous synthesis of the other (the leading strand). The discontinuous synthesis occurs through the addition of short segments of DNA called Okazaki fragments (100 to 1,000 nucleotides), complementary to the parental strand. These short segments are formed on RNA primers. To link up these short segments, DP I, which has exonuclease properties, chews away the primers and fills the gaps with the relevant deoxyribonucleotides. When all the ribonucleotides have been replaced, DP I stops and dissociates from the new double helix. The final reaction needed to complete replication of the lagging strand is to join up adjacent Okazaki fragments, which are now separated by just a single gap between neighbouring nucleotides. All that is needed is to synthesise a phosphodiester bond at this position, a reaction catalysed by DNA ligase. The formation of RNA primers is an important key to the dilemma of replicating anti-parallel strands.

4.2 Synthesis of RNA

1. Introduction

Although the information for the synthesis of all proteins is located in the DNA, the DNA does not serve as a direct template along which the amino acids are laid down to form protein. Instead, it serves as a template for RNA formation.
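The template role of DNA in RNA formation can be illustrated with a short sketch. It assumes that the template strand is supplied in the 3' to 5' direction (the direction in which it is read), so that the RNA, which pairs U with A, A with T, C with G and G with C, comes out written 5' to 3'. The function name and the example sequence are hypothetical.

```python
# Minimal sketch: copy a DNA template strand into RNA by base pairing.
# Assumptions: the template is supplied 3' to 5' and contains only
# A, T, G and C; the function name is illustrative.

DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA transcript, written 5' to 3'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


if __name__ == "__main__":
    print(transcribe("TACGGT"))
```

The transcript carries the same sequence as the DNA strand that was not copied, except that U appears in place of T.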
The copying of a DNA strand into an RNA transcript is called transcription and is the first step of gene expression, an event that connects the genotype and the phenotype. In this first step the gene makes a disposable copy of itself. This expendable copy may be any of the RNA types. It is made by polymerisation of ribonucleotide subunits according to the information on the DNA strand, as follows:

n NTP → RNA (n nucleotides long) + (n - 1) PPi

Transcription occurs from specific, relatively short lengths of DNA.

2. Enzymes

A transcription enzyme, RNA polymerase (Rpol), performs the transcription in a fashion quite similar to replication. Prokaryotes such as bacteria have only one Rpol type for the synthesis of all RNAs (i.e. mRNA, tRNA, and rRNA). Eukaryotes (e.g. humans), on the other hand, have three distinct Rpol types, viz: Type I for rRNA; Type II for mRNA; and Type III for tRNA.

Chemical evidence confirms that transcription takes place on only one of the two DNA strands at a time, i.e. it is asymmetrical (though not necessarily on the same strand throughout the entire chromosome). The DNA strand carrying the same base sequence as the RNA is called the coding or sense strand; the opposite DNA strand, which acts as the template for transcription, is the anticoding or antisense strand. The RNA is always synthesised in the 5' to 3' direction, i.e. the 5' end of an RNA is its beginning. This means that the Rpol moves along the template DNA in a 3' to 5' direction, locally separating the DNA strands of the helix and breaking the weak bonds between the base-paired nucleotides.

The regulatory sequence of DNA nucleotides that provides the signal for initiation and strand selection during transcription is called the promoter site. Transcription also terminates at specific regulatory sequences (the termination region).

3. RNA types

There are three major ones: ribosomal RNA (rRNA), transfer RNA (tRNA), and messenger RNA (mRNA).

(1) Transfer RNA (tRNA)

There is no direct interaction between a codon and the amino acid it represents. Instead, their association is mediated by tRNA molecules.

(a) Structural features:
i. It is clover-leaf shaped due to extensive intra-chain base pairing. This shape exposes the anticodon loop at one end;
ii. At the other end is the aminoacyl attachment site (i.e. the C-C-A-3'-OH);
iii. There is at least one specific type for each of the 20 amino acids found in proteins.

(b) Function: it functions as an adaptor molecule that carries a specific amino acid to the ribosome/mRNA complex.

Together, tRNAs make up about 15% of the total RNA in the cell.

(2) Ribosomal RNA (rRNA)

Ribosomal ribonucleic acid (rRNA) is the RNA component of the ribosome, and is essential for protein synthesis in all living organisms. It constitutes the predominant material within the ribosome, which is approximately 60% rRNA and 40% protein by weight. Ribosomes contain two major rRNAs and 50 or more proteins. In eukaryotes, molecules of rRNA are synthesized in a specialized region of the cell nucleus called the nucleolus, which appears as a dense area within the nucleus and contains the genes that encode rRNA. The ribosomal RNAs of both prokaryotic and eukaryotic cells are synthesized from long precursor molecules called pre-ribosomal RNAs. There are three distinct size species of rRNA (23S, 16S, and 5S) in prokaryotic cells and four (28S, 18S, 5.8S and 5S) in eukaryotic cells. The S in the names of ribosomal subunits, and of other macromolecular particles found in the cell, stands for Svedberg units.
This unit is named after Theodor Svedberg, a Swedish chemist who received the 1926 Nobel Prize in Chemistry for his research into disperse systems (in this case, colloids of macromolecules dispersed in solution). Svedberg pioneered the use of ultracentrifugation to investigate the properties of macromolecules.
Processing occurs as follows: Cuts are made by ribonucleases in the spacer regions of the precursor, for example between the 18S and 5.8S segments. The spacer sequences are removed, releasing the mature rRNAs. These then associate with several proteins as components of the ribosomes as follows:
(a) Eukaryotes
(i) Small ribosomal subunit: 18S rRNA + 30 proteins = 40S
(ii) Large ribosomal subunit: 5S + 5.8S + 28S rRNAs + 50 proteins = 60S
A functional eukaryotic ribosome (80S) is then made up of a combination of the 40S and 60S ribosomal subunits.
(b) Prokaryotes
(i) Large subunit (50S) = 5S rRNA + 23S rRNA + 34 proteins; and
(ii) Small subunit (30S) = 16S rRNA + 21 proteins.
A functional prokaryotic ribosome (70S) is a combination of the 30S and 50S subunits. (Svedberg values are not additive, which is why 30S + 50S gives 70S and 40S + 60S gives 80S.)
Together, rRNAs make up 80% of the total RNA in the cell.
(3) Messenger RNA (mRNA)
It carries genetic information from the nuclear DNA to the cytosol, where it is used as the template for protein synthesis. The mRNA constitutes only about 5% of the RNA in the cell. In eukaryotes, once formed, it is processed through capping, tailing and intron removal before export to the ribosomes for translation.
4.3 Transcription in prokaryotes
1. The enzyme
(a) Is RNA polymerase (Rpol) for all forms of RNA.
(b) Has requirements similar to those of Dpol, except that ribonucleoside triphosphates rather than deoxyribonucleoside triphosphates are its raw material and no primer is needed.
(c) It comprises five peptide subunits, 2 α (alpha), 1 β (beta), 1 β′ (beta prime) and 1 σ (sigma), and in this complete form is called the holoenzyme. The 4 subunits (i.e. minus sigma) are responsible for the 5′ to 3′ RNA polymerase activity, and are referred to as the core enzyme. The sigma subunit enables the Rpol to recognize the correct point along the DNA template (i.e. the promoter) where RNA transcription is initiated.
2. The Process
This may be divided into 4 stages, namely template binding, chain initiation, chain elongation and chain termination.
(1) Template binding
(i) Involves the binding of Rpol to DNA.
(ii) The promoter regions of the DNA are recognized by the sigma subunit of the holoenzyme.
(iii) The initiation sites consist of fewer than 50 deoxyribonucleotide residues and are rich in A-T base pairs, a factor that facilitates easy opening of the duplex.
Rpol binds to any region of the duplex DNA. By binding loosely, releasing momentarily and then binding again, an Rpol molecule explores the DNA. With the aid of the sigma factor, the enzyme recognises a nucleotide sequence (the promoter region) at the beginning of the gene at which synthesis can begin. At this point, the holoenzyme binds tightly, causing the duplex to open up and so allowing transcription to begin.
(2) Chain initiation
The residues of the promoter region are themselves not transcribed. Instead, the enzyme moves along the DNA molecule until it encounters the first 2 nucleotides to be transcribed. The complementary ribonucleoside triphosphates (the first is usually ATP or GTP) are inserted and linked by a phosphodiester bond, resulting in chain initiation.
(3) Chain elongation
(i) Once the first two nucleotides have been linked, chain elongation proceeds rapidly in the 5′ to 3′ direction, anti-parallel to the DNA template strand.
(ii) After about 10-12 ribonucleotides have been added to the growing RNA chain, the σ (sigma) subunit dissociates from the holoenzyme and elongation proceeds at the rate of about 50 nucleotides/sec by the addition of NTPs to the 3′ end of the chain.
As in DNA synthesis, two high-energy bonds are used for the addition of each nucleotide.
(iii) The elongation process produces a temporary DNA/RNA hybrid, with the DNA unwinding in front of Rpol and closing and reforming behind it. This moving region of unwound DNA is called the transcription bubble.
(4) Chain termination
(i) When the enzyme encounters a termination signal (a specific nucleotide sequence), synthesis is completed. At some terminators this requires the co-activity of the termination protein, the ρ (rho) factor.
(ii) The transcribed RNA is released from the DNA template, and the core enzyme dissociates.
(iii) This then marks the end of the 1st step in the information flow within the cell.
NOTE:
1. Since there is no membrane barrier between the genes that are being transcribed and the cytoplasmic area where protein synthesis occurs, the transcription of RNA can proceed at the same time as its utilization. Thus, transcription and translation are coupled in prokaryotes.
2. No primer is needed to begin the new chain, only a duplex DNA.
3. Usually the transcript contains information for several related polypeptides (i.e. it is polycistronic).
4.4 Transcription in Eukaryotes
This is similar to transcription in prokaryotes and, as in prokaryotes, no primer is needed. However, it is far more complicated. Also,
(i) Eukaryotes do not in general produce polycistronic mRNA. Their primary transcripts are first "processed" into a mature mRNA prior to translation.
(ii) They have 3 Rpols, of which type II is for synthesis of mRNA.
(iii) Eukaryotic DNA is transcribed into rather long, heterogeneous pieces of RNA of between one and 30 gene lengths. These RNAs together are called heterogeneous nuclear RNA (hnRNA) or pre-mRNA. However, most of it (i.e. about 90%) is degraded before it exits from the nucleus. A typical hnRNA molecule has a 10-minute life-time.
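The direction rules described in sections 4.2 to 4.4 (the template strand is read 3′ to 5′ while the RNA grows 5′ to 3′, and the RNA carries the same sequence as the coding strand with U in place of T) can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; promoter recognition, the sigma factor and termination are not modelled.

```python
# Base-pairing rules used during transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a DNA template strand.

    The template is given in the 3'->5' direction, i.e. in the order in
    which Rpol reads it, so the RNA is built base by base from its 5' end.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(rna: str) -> str:
    """The coding (non-template) DNA strand matches the RNA, with T for U."""
    return rna.replace("U", "T")


template = "TACGGCATT"      # hypothetical template strand, written 3'->5'
rna = transcribe(template)
print(rna)                  # AUGCCGUAA
print(coding_strand(rna))   # ATGCCGTAA
```

Note that no primer appears anywhere in the sketch: as stated above, Rpol needs only the duplex DNA to start a new chain.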
4.5 Post transcriptional modification of RNA
RNA processing involves at least two nuclear events: modification of one or both ends of the primary transcript through the addition of nucleotides to the termini of some RNA chains, and removal of intervening noncoding sequences (IVSs). These modifications are essential to efficient processing and, subsequently, to translation.
(1) Terminal modifications
(i) 5′ capping: The 5′ end of pre-mRNA is modified by addition of a structure called a 7-methylguanosine cap. A guanine-containing nucleotide is added "backward" to the 5′ carbon of the first nucleotide, where the triphosphate is found, giving a 5′-to-5′ linkage. This is catalysed by the nuclear enzyme guanylyl transferase. Then the G is methylated by guanine-7 methyltransferase.
(ii) Poly-A tailing: At the 3′ end, a sequence of 40-200 A bases, called the poly(A) tail, is added. This is thought to occur after the RNA has been cleaved. Tailing is mediated by the enzyme polyadenylate polymerase.
(2) Excision of introns
Maturation of eukaryotic mRNA usually involves the removal from the primary transcript of RNA sequences that do not code for protein (the introns). The remaining coding sequences, the exons, are spliced together to form the mature mRNA, a process referred to as RNA splicing. Genes that contain introns and exons are commonly referred to as split genes. The mature mRNA is then shipped to the cytoplasm on its way to the ribosomes.
4.6 Split Genes
A split gene (also called an interrupted gene) is a gene that contains sections of DNA called exons, which are expressed as RNA and protein, interrupted by sections of DNA called introns, which are not expressed. The DNA sequence in the exons provides the instructions for coding proteins.
This type of genetic organization is typical of most eukaryotic genes and some animal viral genomes. Introns are rare in the DNA of prokaryotes.
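RNA splicing as described in section 4.5 can be pictured as cutting the introns out of the primary transcript and joining the exons in their original order. The following is a minimal Python sketch of that idea only. The sequence and the exon coordinates are invented for illustration (exons in upper case, introns in lower case), and the real recognition of splice sites by the cell's splicing machinery is not modelled.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a primary transcript in order.

    Each exon is a (start, end) pair of 0-based positions, end exclusive.
    Everything outside the listed exons (the introns) is discarded.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical primary transcript: exon 1, intron 1, exon 2, intron 2, exon 3.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUCGGA" + "guaugcacag" + "UAA"
exons = [(0, 6), (18, 24), (34, 37)]

mature = splice(pre_mrna, exons)
print(mature)                      # AUGGCCUUCGGAUAA
print(len(pre_mrna), len(mature))  # 37 15
```

As with real split genes, the mature message is much shorter than the primary transcript because the intervening sequences have been removed.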
The existence of split genes was first suggested because of a lack of correspondence between the genetic map and specific mRNA molecules. For instance, the ovalbumin gene has 7,700 bps. After splicing, the mature mRNA has only about 1,872 bases, translating to a protein of about 386 amino acids.
4.7 Summary of differences in expression between prokaryotic and eukaryotic genes
(i) Prokaryotes: All RNA species are synthesized by a single RNA polymerase. Eukaryotes: Three different RNA polymerases are required.
(ii) Prokaryotes: mRNA is translated during transcription. Eukaryotes: mRNA is processed by capping, tailing and RNA splicing before exit to the cytoplasm.
(iii) Prokaryotes: Genes are contiguous segments of DNA that are colinear with the mRNA that is translated into a protein. Eukaryotes: Genes are often split. They are not contiguous segments; rather, the coding sequences are interrupted by intervening sequences.
(iv) Prokaryotes: mRNA is generally polycistronic. Eukaryotes: mRNA is monocistronic.
Topic Summary
Following the acceptance of the two nucleic acids as the genetic material, we have learnt in this topic how they are synthesized so that copies can be made available to daughter cells following cell division. We have learnt that following the unravelling of the physical structure of the DNA molecule, three modes of replication were suggested. These were: semi-conservative, conservative, and dispersive. It took the experiment of Meselson and Stahl to identify the correct mode by which DNA replicates. However, not all DNA is double-stranded. We have thus also learnt how replication occurs if the DNA molecule is single-stranded. We have also learnt about the enzymes that participate in this process and how the process occurs at the molecular level.
The formation of RNA from DNA is called transcription. We have studied how this occurs in eukaryotes and prokaryotes.
In the case of eukaryotes, we have also learnt that the resultant RNA transcripts must be processed to form the three kinds of RNA: mRNA, rRNA, and tRNA. The roles of the three types of RNA have also been reviewed.
Further Reading
(1) Russell, P.J. (1992). Genetics (3/e). HarperCollins Publishers, NY
(2) Stine, G.J. (1989). The New Human Genetics. WCB Publishers, Iowa
TOPIC ASSIGNMENT
1. What five enzymes are key to DNA replication?
2. What modes were suggested for DNA replication? Outline them.
3. Describe how Meselson and Stahl confirmed the correct mode.
4. The phage M13 has a single-stranded DNA molecule. How does it make copies of itself?
5. Give any three differences in gene expression between prokaryotes and eukaryotes.

TOPIC 5. THE GENETIC CODE, TRANSLATION, AND PROTEIN SYNTHESIS
Introduction
As stated in the previous topic, gene expression is a two-step process: transcription and translation. This topic covers the second part of this process. In this part, therefore, we consider how the instructions coded in the mRNA are turned into the chain of amino acids of a protein. At the conclusion of this topic, it is expected that the student will have understood the relationship between a gene and a phenotype, e.g. the ability of a gene to define a given phenotype.
Topic Time
To understand this topic well, four lecture hours are estimated to be sufficient. The student needs additional time to review those aspects that have not been sufficiently understood and to attempt the assignment.
Topic Learning Requirements
There are no specific learning requirements for this topic. However, an advance study of the topic will make it easier to follow.
Learning Outcomes
After successfully completing this topic, you should be able to:
i. Define the genetic code and explain how it was broken and its size determined
ii. Define translation and describe the key steps in this process.
iii. List and categorise the components required for each step of protein synthesis
iv. Answer any problem relevant to this topic
Topic Content
5.1 The Genetic Code
Because plants and animals produce an enormous variety of proteins, the question asked by researchers in the 1960s was "how does DNA, with only four bases, provide the instructions that allow for so many different arrangements of amino acids to form an almost infinite variety of proteins?". This question was answered through understanding the 4-letter genetic language of the DNA molecule, an achievement popularly called the "cracking of the genetic code" or the "deciphering of the genetic code". The first step was taken in 1961 by Marshall Nirenberg and J.H. Matthaei. The discovery of mRNA and how it functions led to the solution of the genetic code.
The genetic code is a dictionary that identifies the correspondence between a sequence of nucleotide bases and a sequence of amino acids.
1. Breaking the Code
Cracking the genetic code was done using cell-free extracts from E. coli and synthetic mRNA (earlier made by Severo Ochoa using the enzyme polynucleotide phosphorylase). This was done at the National Institute of Arthritis and Metabolic Diseases, Bethesda, Maryland, USA. All the necessary components for protein synthesis except mRNA were present in these extracts.
After incubating the mixture for 1 hour at 35 °C, Matthaei precipitated the proteins from the mixture using trichloroacetic acid, washed the precipitate, and placed it in a radioactivity counter.
Results: In the presence of "poly-U" nucleic acid, but not in its absence, amino acids were incorporated into a material that could be precipitated in an acid medium, i.e. into proteins.
Conclusion: The poly-U had led to the synthesis of a protein, entirely in vitro! This achievement was on 22nd May 1961.
For the next 5 days Matthaei tried to determine which amino acid(s) was or were incorporated into proteins in the presence of poly-U. On the 27th May at 6 PM, he got the answer: the poly-U coded for a monotonous protein consisting of a chain of a single amino acid, phenylalanine. Thus, in less than a week Matthaei had identified the first "word" of the genetic code! Nirenberg, the head of the team, finished characterizing the product of the experiment. In November 1961, in the Proceedings of the National Academy of Sciences, they published their work, presenting the results obtained with different RNAs including the poly-U molecule. Results with other homopolymers showed that:
(i). Bacterial extract plus mRNA: AAA, AAA, AAA = Lys-Lys-Lys protein;
(ii). Bacterial extract plus mRNA: CCC, CCC, CCC = Pro-Pro-Pro protein;
(iii). Bacterial extract plus mRNA: GGG, GGG, GGG = Gly-Gly-Gly protein.
Following this lead, but using mixed-base synthetic mRNA polymers, Nirenberg and colleagues were able to decipher the complete code by 1966.
Since the advent of gene cloning and DNA sequencing, it is now a straightforward operation to work out the genetic code. It involves comparison of a protein sequence, determined by chemical and enzymatic techniques, with its DNA sequence.
2. Size of word for an amino acid
The next challenge was to determine the number of nucleotides necessary to code for the placement of one amino acid into the protein chain. Nirenberg and other scientists reasoned that, because there are only 4 different nucleotides in DNA, the 4 had to be used in some combination in order to code for all 20 naturally occurring amino acids. Thus they established that:
i. the genetic code is not singlet (4 words are too few for 20 amino acids),
ii. the genetic code is not doublet (4 x 4 = 16 words are still too few), but
iii. the genetic code could be triplet, i.e. a combination of 4 x 4 x 4.
The result is 64 triplets comprising both sense and nonsense triplets.
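The counting argument above, and the homopolymer results reported from the cell-free system, can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, and the small dictionary holds just the four homopolymer assignments quoted in the text, not the full genetic code.

```python
from itertools import product

BASES = "UCAG"        # the four RNA bases
N_AMINO_ACIDS = 20    # amino acids that must be specified


def n_words(word_size: int) -> int:
    """Number of distinct words of a given length from a 4-letter alphabet."""
    return len(BASES) ** word_size


for size in (1, 2, 3):
    verdict = "enough" if n_words(size) >= N_AMINO_ACIDS else "too few"
    print(size, n_words(size), verdict)

# Enumerate every possible triplet explicitly.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))    # 64

# Only the homopolymer assignments quoted in the text.
HOMOPOLYMER_CODE = {"UUU": "Phe", "AAA": "Lys", "CCC": "Pro", "GGG": "Gly"}


def decode_homopolymer(mrna: str) -> str:
    """Read a synthetic homopolymer three bases at a time."""
    usable = len(mrna) - len(mrna) % 3
    words = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return "-".join(HOMOPOLYMER_CODE[word] for word in words)


print(decode_homopolymer("UUUUUUUUU"))   # Phe-Phe-Phe
```

A word size of three is the smallest that gives at least 20 words, which is why the triplet was the favoured hypothesis.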
The sense triplets are 61 and code for amino acids, whereas the other three are stop triplets. The stop triplets are not recognised by any amino acid-carrying tRNA; instead they terminate translation, causing the release of the newly synthesised protein and disassembly of the translational machinery. The 3-base (triplet) sequence in mRNA was thereafter designated a codon.
In 1968, M. Nirenberg, H.G. Khorana and R. Holley shared the Nobel Prize for their work in "cracking the genetic code". They had explained "how a gene controls the arrangement of amino acids into a specific protein".
Because the code comes into play only during the translation part of gene expression, i.e. during the decoding of mRNA to polypeptide, geneticists usually present the code in the RNA dialect of A, G, C, and U.
5.2 Translation
Translation is the final step on the way from DNA to protein. It is the synthesis of proteins directed by an mRNA template. The information contained in the nucleotide sequence of the mRNA is read as three-letter words (triplets), called codons. Each word stands for one amino acid. Once mRNA makes its way to the cytoplasm, it proceeds through the three phases of translation: initiation, elongation, and termination.
1. Components
In order to decode the information held within the mRNA and to turn it into a protein, two other species of RNA are required: tRNA (necessary for the transport of amino acids to the ribosome, where mRNA is positioned to produce a protein), and rRNA (an integral part of the ribosome, which holds the entire complex of mRNA, tRNA and ribosomal proteins together).
2. The Process
The synthesis of proteins requires that the instructions in mRNA be interpreted into an ordered list of amino acids that are assembled into a particular protein (polypeptide chain). The many components involved are listed below:
Component: Function
(i) Ribosomes: workbench for translation.
(ii) tRNA: adaptor molecule.
(iii) Amino acids: building blocks of protein.
(iv) mRNA: template.
(v) Aminoacyl synthetase: charging tRNA.
(vi) Initiation factors: mediate the various binding steps of the initiation process.
(vii) Peptidyl transferase: peptide bond formation during elongation of the growing polypeptide.
(viii) Ions (Mg++, K+): stabilize conformation of ribosomes.
(ix) ATP: energy source, essential for charging tRNA.
(x) GTP: energy source for translocation.
Translation is therefore a chemical reaction that occurs in steps, beginning with the charging of tRNAs, followed by chain initiation, chain elongation and chain termination.
(1) Activation of amino acid or charging a tRNA (also called aminoacylation)
Each amino acid (aa) is attached to a tRNA molecule that is specific to that amino acid by a high-energy bond derived from ATP. Catalysis is by a synthetase. There is a separate synthetase for each amino acid. Once the amino acid is attached we can speak of a "charged" tRNA.
aa1 + tRNA1 + ATP → aa1-tRNA1 + AMP + PPi (catalysed by synthetase1)
The energy of the charged tRNA is later converted into a peptide bond linking the amino acid to another on the ribosome.
Note: Upon activation, an amino acid is thermodynamically capable of being efficiently used for protein synthesis.
In all protein syntheses, the first amino acid to be used is methionine (Met). The tRNA is tRNAmet and the enzyme for this is methionyl-tRNA synthetase. Only methionyl-tRNAimet (i.e. methionine attached to tRNAimet) can directly enter the P site of the ribosome and bind to a small ribosomal subunit to begin the process of protein synthesis. Whereas all bacterial polypeptides start with N-formyl methionine, it is just methionine in eukaryotes.
There are at least two tRNAs for methionine:
i. tRNAimet, for initiation of protein synthesis, and
ii. tRNAomet, which incorporates methionine into a growing polypeptide chain.
(2) Initiation
AUG is the initiation signal in mRNA.
Initiation is a 3-step process:
Step 1: Attachment of: one, IF3 to a free 30S subunit; two, initiator tRNA (i.e. f-met-tRNA) to an IF2-GTP; and three, the union of the whole complex.
Step 2: The resulting IF3-30S-f.met-tRNA-IF2-GTP complex (now called the initiation complex) then binds to mRNA, a process that may also require IF1. This occurs at a site that is located just upstream of the AUG codon. Bacterial mRNA can have several AUG initiation sites, but eukaryotic mRNA molecules almost always have a single functional AUG near the 5′ end.
Step 3: Lastly, the large ribosomal subunit (50S in prokaryotes or 60S in eukaryotes) joins the complex. The bound GTP is hydrolysed to GDP and inorganic phosphate, and the initiation factors (IFs) detach. The Met-tRNAimet bearing the first amino acid is now bound to the ribosome at the P site. The initiation complex is ready to begin synthesis of the peptide chain.
In E. coli every protein is initiated by N-formylmethionine. It is non-formylated methionine in higher organisms.
(3) Elongation
This is the stage in which the mRNA is decoded and the polypeptide is synthesised.
Ribosomes use two tRNA-binding sites, 'A' and 'P', during protein elongation. The P-site is the peptidyl site, and the A-site is the aminoacyl site. For the peptide chain to begin to grow, a second amino acid that is correctly bound to its tRNA must be brought into its proper position on the ribosome. The 'A' site accommodates the incoming aminoacyl-tRNA that is to contribute a new amino acid to the growing chain.
The attachment of the aa-tRNA precursor to the ribosome is a complex reaction. It starts when one of the elongation factors (EF-Tu) reacts with GTP and aa-tRNA to form an aa-tRNA-GTP-EF-Tu complex. This complex then transfers its aa-tRNA component to the 'A' site with release of a free EF-Tu-GDP complex and inorganic phosphate. EF-Ts is responsible for the dissociation of GDP from EF-Tu, which allows another GTP to bind and the cycle to continue.
The P-site contains the peptidyl-tRNA complex, i.e., the tRNA linked to the amino acids that have so far been added to the chain.
The general formula for an amino acid is H2N – CHR – COOH, in which the R group can be anything from a hydrogen atom (as in the amino acid glycine) to a complex ring (as in the amino acid tryptophan).
A peptide bond is formed through a condensation reaction that involves the removal of a water molecule, for example between the -COOH group of amino acid 1 (aa1), methionine, and the amino group of the next amino acid (aa2), e.g. that of phe-tRNAphe now in the A-site, to form the peptide methionylphenylalanyl-tRNAphe. This is catalysed by an enzyme activity within the ribosome called peptidyl transferase.
The tRNA which brought the f.met to the P-site needs to be removed. This is done by a 2nd enzyme activity within the ribosome, tRNA deacylase. This breaks the bond between the amino acid f.met and the tRNA, freeing the tRNA so that it may be recharged and recycled.
This is followed by ribosomal translocation. In translocation, the ribosome shifts down the mRNA by one codon, moving the tRNA carrying the growing polypeptide chain to the P site, and moving the next mRNA codon into the A site.
The average rate of amino acid addition in E. coli is of the order of 15 to 20 amino acids/sec, so that a protein 300 - 400 amino acids long is made in about 20 seconds.
(4) Termination
Two conditions are necessary for chain termination:
a. The presence in the A site of a codon (i.e. UAA, UAG or UGA) that specifically signifies that polypeptide elongation should stop.
b. The presence of a GTP-bound release factor (RF, also called termination factor, TF), which reads the chain-terminating signal.
When any of the stop codons is recognised by the termination factors (TFs), the peptidyl-tRNA complex is released.
Almost simultaneously the complex divides into an uncharged tRNA molecule and a newly completed protein that can either assume its final shape or combine with additional protein subunits. After releasing its peptidyl-tRNA, the ribosome disengages from the mRNA and divides into its two subunits, whereupon it is ready to start the whole cycle over again.
Topic Summary
In this topic, you have learnt how a gene product is realized. For a simpler understanding of this process, we needed to understand the genetic code, both its size and what 'letters' spell which 'word'. This enabled us to discuss the process of translation (protein synthesis). In prokaryotes, we saw that transcription and translation are coupled, that is, they occur simultaneously, but in eukaryotes translation occurs in the cytoplasm while transcription takes place in the nucleus. Translation was noted to be a 4-step process involving aminoacylation, chain initiation, elongation and termination. Thus, we saw how the expression of a gene leads to a protein, which has a bearing on the phenotype.
Further Reading
1. Russell, P.J. (1992). Genetics (3/e). HarperCollins Publishers, NY
2. Suzuki, D.T., Griffiths, A.J.F., Miller, J.H. and Lewontin, R.C. (1986). An Introduction to Genetic Analysis. (3/e). W.H. Freeman and Co. NY
TOPIC ACTIVITIES
Use these questions to test your understanding of this topic.
1. Explain briefly how the genetic code was 'cracked'.
2. Give any three characteristics of the genetic code.
3. Give an outline of the process of translation.
4. If a gene is 720 base pairs in size, how many amino acids could the resultant protein comprise, assuming one codon each for the start and stop signals?

TOPIC 6. MECHANISMS OF CONTROL OF GENE EXPRESSION
Introduction
All organisms have many genes that perform various intricate activities in them through their gene products.
The process of gene expression relates the gene to its phenotype. The challenge that faced the early molecular geneticists was to understand whether all the genes in an individual functioned all the time. Was gene action under any control and, if so, how was this operationalised? The answer was provided through insightful experiments by some scientists at the French school led by Pardee, Jacob and Monod in the 1960s.
Topic Time
Because this topic is new to most students, a slow and incisive approach is vital if it is to be grasped. Thus, three lecture hours are mandatory. However, additional time for revisiting the content and tackling the assignment is strongly encouraged.
Topic learning requirements
Prior reading of related content is strongly advised if you are to follow and easily understand this topic.
Learning Outcomes
After you successfully complete this topic, you should be able to:
i. Demonstrate that gene expression is indeed a regulated process
ii. Cite at least four examples of gene regulation in your surroundings
iii. Define an operon and name several examples thereof
iv. Demonstrate how Jacob and Monod showed that gene regulation in organisms occurs
v. Demonstrate how mutations in an operon may interfere with its normal function
vi. Answer any questions relevant to this topic
Topic Content
6.1 Introduction
Control of gene expression is, in essence, control over the amount of gene product present in the cell.
Q: How can the synthesis rate of a gene product be regulated?
A: By exerting control over any one, or a combination, of the various steps in the gene expression pathway. The possible points are:
1. Transcription,
2. mRNA degradation,
3. mRNA processing,
4. Translation: control could be exerted over the number of ribosomes that can attach to a single mRNA, or over the rate at which individual ribosomes translate a message.
However, the best understood mechanisms, in both prokaryotes and eukaryotes, depend on the first possibility, that is, regulatory control over transcription.
6.2 Gene and protein types in bacteria
These can be categorized into structural and regulatory genes and proteins.
1. Genes
(1) Structural genes
These encode the largest group of cellular proteins, e.g. enzymes, membrane proteins and ribosomal components. They do not regulate transcription but are themselves regulated.
(2) Regulatory genes
These code for proteins that help the cell sense its environment and regulate transcription of structural genes by binding to DNA. Regulatory proteins can be divided into negatively acting and positively acting.
a. Negatively acting proteins (also called repressors)
These repress a gene or an operon by binding to DNA at or near a promoter site (usually the operator) and physically preventing mRNA synthesis by denying RNA polymerase access to that gene or operon. Repressors combine with effectors, and this changes their ability to bind to the operator sites. Effectors are either inducers like lactose or co-repressors like tryptophan.
(i) Inducers: Binding with an inducer produces a change in the shape of the repressor protein, and in this way decreases its binding affinity for the operator locus;
(ii) Co-repressors: Co-repressors combine with repressors (that usually are not functional without the co-repressor) to make them functional. For example, the amino acid tryptophan (in the tryptophan operon) combines with the trp repressor protein, and only then can the repressor protein bind to the operator and inhibit transcription of the genes for the enzymes that make tryptophan. Thus, the amino acid is acting as a co-repressor, and the addition of tryptophan to an E. coli culture stops the synthesis of the tryptophan-producing enzymes.
b. Positively acting proteins (also called activators)
These bind to DNA at or near a promoter site and increase the efficiency with which RNA polymerase binds to the promoter. The sites to which they bind are called activator sites. An example is the AraC protein, which is necessary for the transcription of genes responsible for arabinose utilization.
6.3 Purpose of Gene Regulation
Gene regulation prevents a cell from wasting energy by making unnecessary components. For instance:
1. Bacteria
By regulating expression of their genes, E. coli and other bacteria are able to respond quickly to changes in their environments without wasting energy by maintaining in an active state genes whose products are of no immediate use.
2. Eukaryotes
Gene regulation in eukaryotes is similar to that in prokaryotes, but is more sophisticated in both the signals involved and the impact gene regulation has on the organism, viz:
a. Eukaryotic cells respond to a greater range of regulatory stimuli, e.g. plant cells switch on genes for photosynthetic proteins in response to light.
b. Some eukaryotic genes are developmentally regulated, e.g. human globin genes, with different members of the α and β gene families being expressed in the embryo, foetus and adult.
c. Gene regulation in multicellular organisms results in cell specialization. In humans there are about 250 different cell types, each with a different morphology, biochemistry and role to play in the organism.
6.4 Understanding Gene Regulation
Studies using bacterial cells provide the foundations of our understanding of gene control. In bacteria, regulatory proteins that bind to specific sites on DNA carry out most gene control. The bound protein either prevents or increases the synthesis of specific mRNA.
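The negative control described in section 6.2 can be summarised as a small truth table: an inducer removes the repressor from the operator, whereas a co-repressor is needed before the repressor can bind. The Python sketch below is a deliberately simplified illustration of that logic. The function names and the labels "inducible" and "repressible" are assumptions introduced here, and positive control by activators, promoter strength and partial binding are all ignored.

```python
def repressor_bound(system: str, effector_present: bool) -> bool:
    """Is the repressor sitting on the operator?

    "inducible":   the repressor binds unless an inducer (e.g. lactose)
                   is present and changes its shape.
    "repressible": the repressor binds only when a co-repressor
                   (e.g. tryptophan) is present.
    """
    if system == "inducible":
        return not effector_present
    if system == "repressible":
        return effector_present
    raise ValueError(f"unknown system: {system!r}")


def genes_transcribed(system: str, effector_present: bool) -> bool:
    """Structural genes are transcribed only when the operator is free."""
    return not repressor_bound(system, effector_present)


for system, effector in (("inducible", "inducer"), ("repressible", "co-repressor")):
    for present in (False, True):
        state = "ON" if genes_transcribed(system, present) else "OFF"
        print(f"{system:12s} {effector} present={present!s:5s} -> transcription {state}")
```

Read this way, adding tryptophan to a culture switches the repressible system off, while adding lactose switches the inducible system on, in line with the examples given in section 6.2.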
6.5 Prokaryotic Gene Regulation
It is common in bacteria for one promoter to serve a series of clustered genes, and such genes often produce enzymes that are active in a single metabolic pathway. Such a gene cluster constitutes an operon, and all its genes are transcribed into a single polycistronic mRNA (i.e. an mRNA that encodes more than one polypeptide). Each gene of the operon is represented in the mRNA, and each section of the mRNA is independently translated as shown below.

DNA strand:   P | Gene A | Gene B | Gene C
                  (transcription)
mRNA:         5' AUG ... UGA    AUG ... UGA    AUG ... UGA
                  (ribosomal subunits attach to each initiation site; translation)
Products:     Peptide A         Peptide B         Peptide C

6.6 Operons
Through their investigations on transcription, translation and protein synthesis in E. coli, Francois Jacob and Jacques Monod (1950s to 1960s) provided a definite answer to the question of gene regulation in the living system. Their concept came to be called the lactose operon and was the first documented system of gene regulation. Jacob and Monod received the Nobel Prize for their operon theory in 1965. An operon is a unit consisting of adjacent cistrons that function coordinately under the control of an operator gene.
1. The lactose operon
Jacob and Monod predicted, on the basis of genetic studies of the lactose genes, that the transcription of the lactose gene DNA into RNA was controlled by three classes of genes:
* A regulator gene (I),
* An operator gene (O), and
* One or more structural genes.
In this operon, gene Z is responsible for the production of β-galactosidase, an enzyme that cleaves lactose into glucose and galactose.
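The independent translation of each section of a polycistronic mRNA (section 6.5) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the toy mRNA is invented for illustration, every AUG found after the previous stop codon is treated as an initiation site, and ribosome-binding sequences are ignored. The codon table is the standard genetic code written in the compact UCAG order.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_polycistronic(mrna: str) -> list[str]:
    """Translate each AUG...stop section of an mRNA into its own peptide."""
    peptides = []
    search_from = 0
    while True:
        start = mrna.find("AUG", search_from)
        if start == -1:
            break
        peptide = []
        pos = start
        while pos + 3 <= len(mrna):
            amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
            pos += 3
            if amino_acid == "*":        # termination codon ends this peptide
                break
            peptide.append(amino_acid)
        peptides.append("".join(peptide))
        search_from = pos                # next initiation site lies downstream
    return peptides


# Hypothetical three-gene message: spacer bases separate the three sections.
toy_mrna = "GG" + "AUGAAAUGA" + "CC" + "AUGGGGUGA" + "A" + "AUGUUUUGA"
toy_peptides = translate_polycistronic(toy_mrna)   # one peptide per gene
```

One message therefore yields three separate peptides, which is why a single promoter can coordinate a whole pathway.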
I    P    O    Z    Y    A
Gene I → mRNA → repressor protein
Gene Z → β-galactosidase;  Gene Y → permease;  Gene A → transacetylase

Key: Gene I = produces a repressor protein that binds to the operator; Gene O = the operator, which is repressed by the protein from gene I; Genes Z, Y and A = produce the enzymes β-galactosidase, permease and transacetylase respectively.

In wild-type E. coli these genes are normally in a repressed state, i.e. no enzymes are formed. If glucose and lactose are both present, glucose is preferentially used while the operator remains shut down. However, when glucose is used up, lactose molecules interact with the repressor protein, stopping it from repressing the operator. The unrepressed operator allows RNA polymerase to attach, and transcription proceeds along the length of the structural genes. In this system, therefore, an active repressor plus an inducer gives an inactive repressor, which then permits the structural genes to produce mRNA.

2. Faults in the Lactose Operon
Mutations in the I and O genes have been found that alter the control of the operon. Mutations in Z can make normal I and O genes useless with respect to control of the production of a normal protein, because a mutant Z gene cannot yield a functional enzyme even when transcription is permitted. Possible mutations that may alter the repression and transcription of this operon are:
I O Z: the normal operon. Repressor protein is made; there is no transcription without the inducer; the enzyme is inducible.
I- O Z: an operon with a mutant regulator gene. No repressor protein is made, so transcription is continuous and the enzyme is constitutive.
I Oc Z: an operon with a mutant operator. The repressor protein is made but cannot attach to the operator site, so transcription is continuous and the enzyme is constitutive.
I O Z-: an operon with a mutant structural gene for β-galactosidase. Because of the structural defect, no functional β-galactosidase is produced.

The other example of negative control is the tryptophan operon. As already stated, this is a repressible system, i.e. an inactive repressor plus a co-repressor gives an active repressor, which then prevents the structural genes from producing mRNA.

6.7 Negative and Positive Controls of Transcription
1. Negative Control
The lactose operon is an example of this kind of control, in the sense that the operon is potentially active but is switched off by the product of the regulator gene. The operon is therefore not permitted to express itself unless it is actually required. The product of the regulator gene represses enzyme synthesis. When the repressor is absent, protein synthesis occurs even without the aid of the inducer.
2. Positive control (Activation)
This is a case where the product of the regulator gene is needed for the initiation of transcription. An example is the arabinose (ara) operon, which is regulated by the Ara C protein. When E. coli cells are grown on arabinose as an energy source, they produce the three enzymes needed to convert arabinose into xylulose (which, as xylulose 5-phosphate, enters the pentose phosphate pathway). These three enzymes, a kinase, an isomerase and an epimerase, are the products of the three genes of this operon (B, A and D respectively). The three genes are regulated by the ara regulatory protein encoded by the ara C locus. In the absence of arabinose, the protein behaves as a repressor, binding to the "O" DNA preceding the ara B, A and D genes. Addition of arabinose to the medium releases the ara repressor protein from the operon, because the protein binds to arabinose. The newly formed complex then becomes an activator, attaching to a separate initiator site on the operon and stimulating RNA polymerase binding and transcription of the B, A and D genes.
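The lac operon genotypes of section 6.6 and the switch in Ara C behaviour described in section 6.7 can be summarised as a small Python sketch. It is a hedged illustration only: the function names and the allele labels "+", "-" and "c" are assumptions chosen for this example, glucose is taken to be absent, and "expression" means production of functional β-galactosidase.

```python
def lac_enzyme_made(i_gene: str = "+", o_gene: str = "+",
                    z_gene: str = "+", lactose: bool = False) -> bool:
    """Return True if functional beta-galactosidase is produced (glucose absent)."""
    repressor_made = (i_gene == "+")          # I- makes no repressor
    operator_accepts_repressor = (o_gene == "+")   # Oc cannot bind the repressor
    repressor_active = repressor_made and not lactose  # the inducer inactivates it
    operon_repressed = repressor_active and operator_accepts_repressor
    return (not operon_repressed) and z_gene == "+"    # Z- gives no working enzyme


def ara_c_role(arabinose_present: bool) -> str:
    """Ara C acts as a repressor without arabinose and as an activator with it."""
    return "activator" if arabinose_present else "repressor"


# Hypothetical usage: behaviour of each genotype without and with lactose.
genotypes = {"I O Z": ("+", "+", "+"), "I- O Z": ("-", "+", "+"),
             "I Oc Z": ("+", "c", "+"), "I O Z-": ("+", "+", "-")}
lac_summary = {name: (lac_enzyme_made(*alleles, lactose=False),
                      lac_enzyme_made(*alleles, lactose=True))
               for name, alleles in genotypes.items()}
```

In this table an inducible genotype reads (False, True), a constitutive one (True, True), and a structural-gene mutant (False, False).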
Thus the ara regulatory protein can behave either as a repressor or as an activator, depending on the absence or presence of arabinose, respectively.

Extension of the operon-regulator gene theory may account for some aspects of cellular differentiation. It is believed that all cells of a eukaryote contain the entire genome of the organism, but that much of it is repressed. Differentiation is thought to be the result of the programmed de-repression of different operons.

6.8 Other bacterial operons
1. Galactose operon
This is a set of genes, E, T and K, producing the enzymes epimerase, transferase and kinase respectively, which are needed to convert galactose into glucose.
2. Arabinose operon
This is a set of genes, ara B, ara A and ara D, producing the enzymes kinase, isomerase and epimerase respectively, which are needed to convert arabinose to xylulose.
3. Tryptophan operon
This is an operon comprising five genes required for the synthesis of the amino acid tryptophan. It is an example of a repressible system.
The induction and repression of bacterial enzymes occur mainly through the control of the transcription of the appropriate genes.

Topic Summary
In the previous topic, we learnt that genes express themselves in a two-step process leading to the formation of gene products. It is these gene products that influence the function and phenotypes of organisms. The question that arose subsequently was whether all genes of an organism express themselves all the time or do so only when required. In this topic, we have learnt that gene expression is a highly regulated process, thanks to the work of Francois Jacob and Jacques Monod. This breakthrough was achieved in bacteria by studying how the lactose operon works. We also considered the levels at which gene regulation can be instituted, e.g. at transcription, translation, etc.
Mutations in this bacterial system alter the function of gene regulation.

Further Reading
(1) Russell, P.J. (1992). Genetics (3/e).
(2) Stine, G.J. (1989). The New Human Genetics. WCB Publishers, USA.

Topic Activities
1. What is the purpose of gene regulation in organisms?
2. At what points can gene activity possibly be regulated?
3. If you were to classify genes and proteins, how would you do so?
4. Define 'operon' and give possible examples in bacteria.
5. Name any examples of gene regulation that you are familiar with.
6. Discuss the operation of the 'lactose operon'.
7. What are the possible points of mutation in the lactose operon, and what are their effects?

TOPIC 7. GENE MUTATIONS AND REPAIR
In the foregoing topics we have considered the gene, or DNA, as a normal entity that is expressed with great accuracy. The existence of alternative phenotypes indicates that changes do occur to the gene, and that these changes have significant consequences for the organism in both its function and its phenotype. This variation can arise through either recombination or mutation. In the latter case, how does it occur, and when it does occur, is there provision for restoration? This is the gist of this topic. Learn and appreciate its significance.

Topic Time
It is envisaged that four lecture hours are sufficient to learn this topic slowly and effectively. Any additional time goes to further review and assignments.

Topic Learning Requirements
In BOTA 111, you covered the preliminaries of this topic. A revisit to the relevant notes is therefore strongly encouraged.

Learning Outcomes
After you have successfully completed this topic, you should be able to:
i. Define the two possible sources of heritable variation in an organism
ii. Identify the possible types of gene mutations
iii. Describe the sources of spontaneous and induced mutations
iv. Describe at least one DNA repair mechanism available to the organism
v. Answer any question relevant to this topic

Topic Content
General Introduction
Types of variation in a species
These are: environmental, genetic, or a combination of both environmental and genetic. Changes which are independent of the environment and are heritable provide a permanent step on which selection can act. Such changes occur in the genome of an individual and may be caused by recombination or mutation. Recombination usually causes no marked variation because it merely redistributes existing genetic material among different individuals. Mutation, on the other hand, causes marked variation because it changes the alleles (genes) and/or chromosomes. Mutations can be heritable or non-heritable.

7.1 MUTATION
Introduction
Biological evolution occurs because the hereditary material, DNA, can change from generation to generation. These changes affect the nucleotide sequence or the amount of DNA and can occur spontaneously or by induction. Hereditary information is usually maintained intact by a complex metabolism involving both replication and repair functions. Thus, mutations may be the result of errors in these processes. Mutations provide species with different forms of the genetic material, and they are also the working tools of the geneticist. The resulting phenotypic variability allows the geneticist to identify and study the genes that control the modified traits. Even though many mutations are known to be harmful, there is little that can be done to stop their occurrence. As mutations accumulate in a species, their burden, called the genetic load, increases. Mutations can be classified as either point or gross. In this topic we consider point or gene mutations.

7.1.1 Point mutations
Also called gene mutations, these affect one or a few nucleotides within a gene. They show a tendency to back mutate, i.e. revert to the wild type.
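What a change in a single nucleotide does to the protein can be illustrated with a short Python sketch. It is only a hedged illustration: the helper names are assumptions introduced here, the codons are those used in the worked examples of this topic (the AAA lysine codon and the haemoglobin fragment UCC AAA UAC CGU UAA), and the codon table is the standard genetic code in compact UCAG order.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at a stop codon or an incomplete codon."""
    peptide = []
    for pos in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def substitute(mrna: str, index: int, base: str) -> str:
    """Point substitution: replace the base at a 0-based index."""
    return mrna[:index] + base + mrna[index + 1:]


def delete(mrna: str, index: int) -> str:
    """Single-base deletion at a 0-based index (shifts the reading frame)."""
    return mrna[:index] + mrna[index + 1:]


# Forward mutation and exact reversion of one codon: AAA -> GAA -> AAA.
wild_codon = "AAA"
mutant_codon = substitute(wild_codon, 0, "G")
revertant_codon = substitute(mutant_codon, 0, "A")

# Deleting the last base of the second codon shifts every codon after it.
fragment = "UCCAAAUACCGUUAA"
shifted = delete(fragment, 5)
```

The substitution changes one amino acid and can be undone by a back mutation, whereas the deletion changes every amino acid downstream of it.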
TYPES
Point mutations can be divided into two general classes: base-pair substitutions (<20% of species mutations) and frameshift mutations.
1. Substitutions
These are mutations that alter the sequence but not the number of nucleotides in a gene.
a. Transitions: caused mainly by tautomerization, deamination and base analogs. They involve replacement of a purine by a purine (G to A or A to G) or of a pyrimidine by a pyrimidine (C to T or T to C).
b. Transversions: here a purine replaces a pyrimidine, or a pyrimidine replaces a purine (T or C changes to A or G; A or G changes to T or C).
These mutations cannot be seen directly, but their effects appear in the proteins formed. At protein level they are manifested as:
a. Silent mutation: the altered triplet codes for the same amino acid, e.g. AGG → CGG, both of which code for Arg.
b. Neutral mutation: the altered triplet codes for a different but functionally equivalent amino acid, e.g. AAA → AGA. Changing basic Lys to basic Arg will, at many positions, not alter protein function.
c. Sense mutation: this leads to proteins that are longer than normal, by changing a termination codon into one that codes for an amino acid.
d. Missense mutation: the altered triplet codes for a different amino acid that impairs the function of the protein, e.g. in β-globin:
                  Hb A      Hb S
   DNA triplet:   CTC   →   CAC
   Codon:         GAG   →   GUG
   Amino acid:    Glu   →   Val
e. Nonsense mutation: the altered codon does not code for any amino acid but acts as a chain terminator. It may arise within a sense codon, e.g. ATG (codon UAC, Tyr) → ATT (codon UAA, nonsense triplet). Conversely, a mutation that affects a nonsense codon serves to elongate the protein being formed.
While the mutations above are forward mutations, reverse mutations can also occur, as follows:
a. Exact reversion:
   AAA (Lys, wild type) → forward → GAA (Glu, mutant) → reverse → AAA (Lys, wild type)
b. Equivalent reversion:
   UCC (Ser, wild type) → forward → UGC (Cys, mutant) → reverse → AGC (Ser, wild type)
Note:
a. Not all point mutations are observable.
b. Not all point mutations occur in coding regions. Those that occur in other areas of the gene, e.g. regulatory sites, are called regulatory mutations.
2. Frameshifts
These involve the insertion (addition) or deletion of nucleotides in numbers that are not a multiple of three. Insertions are induced by treating cells with acridine derivatives, while deletions result from hydrolytic loss of a purine base because of low pH or heat, or from the action of alkylating or deaminating agents. Frameshifts cause all DNA beyond the point of mutation to be misread, and they are usually lethal. Whereas in a nucleotide substitution only one amino acid in the protein is altered, in a frameshift the addition or deletion of a single base can cause large-scale changes in the amino acid composition of the polypeptide chain, leading to a non-functional gene product.
Example: Haemoglobin α
   Position:     138   139   140   141   Termination
   Codon:        UCC   AAA   UAC   CGU   UAA
   Amino acid:   Ser   Lys   Tyr   Arg
A deletion of the last base of codon 139 produces a frameshift, thus:
   Codon:        UCC   AAU   ACC   GUU   AA-
   Amino acid:   Ser   Asn   Thr   Val   -

7.1.2 Spontaneous vs Induced Mutations
1. Spontaneous mutations
These arise in nature by chance, due to natural forces such as radiation from cosmic rays and radioactive minerals. They include:
a. Errors in DNA replication,
b. Spontaneous lesions, and
c. Transposable genetic elements, e.g. some phages, transposons, etc.
a. Errors in DNA replication
These result from an illegitimate nucleotide pair (e.g. A-C) being incorporated into a newly synthesised strand. This is a very rare occurrence, i.e. about once in every 10^8 to 10^12 bases. Where it occurs, the DNA polymerases (DPs) correct the anomaly. All prokaryotic DPs have exonuclease activity.
Spontaneous mutations may also occur during DNA replication because of tautomerism, i.e. a shift in the position of a proton that changes the chemical properties of a molecule. The nitrogen bases exist in normal forms: keto (C=O) for G and T, and amino (NH2) for A and C. The rarer forms are enol (COH) and imino (NH) respectively, as below.
   Base        Normal form       Rare form
   Adenine     amino (NH2)       imino (NH), A*
   Cytosine    amino (NH2)       imino (NH), C*
   Guanine     keto (C=O)        enol (COH), G*
   Thymine     keto (C=O)        enol (COH), T*
Tautomeric shifts in the bases change the hydrogen bonding properties such that adenine assumes those of guanine, guanine those of adenine, cytosine those of thymine, and thymine those of cytosine. Tautomers cause transitions, e.g. A-T may become G-T, leading to a G-C mutant (below). The base analog mutagen 5-BU (5-bromouracil), an analogue of thymine, exerts its mutagenic activity through tautomerism.
   Incorporation of 5-BU:   A:T → A:5-BU (keto)
   1st replication:         A:T  and  G:5-BU (enol)
   2nd replication:         A:T,  A:T,  G:C (mutant)  and  A:5-BU (keto)
b. Spontaneous lesions
Types
These are largely two, depurination and deamination.
Depurination: this is the more common of the two. It involves the breaking of the glycosidic bond between base and sugar and the subsequent loss of G or A from the backbone. A mammalian cell spontaneously loses about 10^3 purines from its DNA during a 20-hr generation period at 37 °C. If these lesions were to persist, they would result in significant genetic damage, since during replication the resulting apurinic sites cannot specify a base complementary to the original purine.
Deamination: this is the loss of an amino group (-NH2) due to oxidative deamination of the bases. The bases A, G and C have NH2 groups. The -NH2 group is replaced by a hydroxyl (-OH) group, viz:
(i) A is deaminated to hypoxanthine (H), which in its keto form pairs with C, as below:
   Deamination:        A:T → H:T
   1st replication:    H:C  and  A:T (normal)
   2nd replication:    H:C  and  G:C (mutant)
Thus a base pair such as A:T changes to G:C, a transition.
(ii) G is deaminated to xanthine (X), which pairs with C (no change);
(iii) C is deaminated to uracil (U), which pairs with A during replication.
Luckily, enzyme systems exist within the cell to restore such damage. For instance, the enzyme uracil-DNA glycosylase recognises U residues in the DNA and excises them, leaving a gap that is subsequently filled in.
2. Induced mutations
A mutation is induced when it is caused by the application of a mutagen. The mutagens used are either physical or chemical. Chemical mutagens include industrial chemicals used in paints, solvents, cleaning agents, etc.; derivatives of petroleum compounds; reagents in pesticides; food derivatives; and various drugs. Physical mutagens include radiation from sources such as nuclear weapons testing (fallout) and nuclear power plants, and natural radiation from isotopes such as 40K and 14C.
Action: Mutagens induce mutations by at least three different mechanisms:
a. Replacement of a base in DNA (base analogs),
b. Alteration of a base so that it specifically mispairs with another base (alkylating and deaminating agents), and
c. Damage to a base so that it can no longer pair with any base under normal conditions (for example, the pyrimidine dimers induced by UV light, discussed under DNA repair).
Incorporation of base analogs
Base analogs are mutagenic chemicals whose structure so closely mimics that of a naturally occurring base that they are incorporated by the cell into DNA. Examples are 5-BU (5-bromouracil, which is similar to thymine) and 2-AP (2-aminopurine, which is similar to adenine). Once in place, these analogs have pairing properties unlike those of the bases they replace and can thus produce mutations by causing the insertion of incorrect nucleotides opposite them during replication.
Specific mispairing
Both alkylating and deaminating agents can cause specific mispairing. Both modify side groups of bases in the DNA in situ.
a. Alkylating agents: these include nitrogen and sulphur mustards, methyl methane sulfonate (MMS), ethyl ethane sulfonate (EES) and ethyl methane sulfonate (EMS).
They cause the addition of methyl, ethyl or propyl (CH3, C2H5, C3H7) groups to the bases. This modifies the pairing potential of the bases, so that they pair with non-traditional partners. Alkylating agents can cause both transitions and transversions. For example, EMS adds an ethyl group to the 6th position of G to give O-6-ethylguanine. This behaves like an analog of A and pairs with T, resulting in a G:C to A:T transition at the next replication. EMS can also alkylate the keto group at the 4th position of T.
b. Deaminating agents: these include nitrous acid and formaldehyde. Three of the four natural bases in DNA (A, G, C) have NH2 groups that can be removed by nitrous acid. The products of the reaction and their pairing properties are:
   A to H, which pairs with C;
   G to X, which pairs with C;
   C to U, which pairs with A; and
   5-methyl C to T, which pairs with A.
Another agent, hydroxylamine (HA or NH2OH), causes G-C to A-T transitions only. It acts by adding a hydroxyl group to the amino group of cytosine. The new product, hydroxylaminocytosine, may undergo a tautomeric shift that allows it to pair with A. Following two replications, a G-C pair will have been converted to an A-T pair.
In conclusion, we may now summarise the kinds of damage to DNA and their possible causes as below.

Summary of the general damages to DNA and their causes
   DNA lesion                      Cause
   a. Missing base                 acid and heat remove purines
   b. Altered base                 ionizing radiation, alkylating agents
   c. Incorrect base               spontaneous deaminations: C to U, A to H, etc.
   d. Deletions, insertions        intercalating agents, e.g. acridine dyes (+)
   e. Cyclobutyl dimers            UV radiation
   f. Strand breaks                ionizing radiation, chemicals
   g. Cross-linking of strands     light induced, or due to antibiotics, e.g. mitomycin C
(+) Acridines have a structure that allows insertion or intercalation between base pairs along the DNA strand.
This leads to stretching of the DNA backbone, thereby causing the deletion or addition of a nucleotide pair.

7.2 DNA REPAIR
From the foregoing, it is evident that DNA molecules may be damaged by a variety of agents and that this damage must be repaired to prevent the accumulation of harmful mutations. If the damage is left uncorrected, both growing and non-growing somatic cells will suffer genetic damage and no longer function. Moreover, germ cells whose DNA carries too many mutations will lead to unviable offspring. Repair is carried out by a number of genetically controlled systems.
1. Mutation rates
The level of mutation is kept low by the efficiency of DNA polymerase and of the repair mechanisms. For instance, single-strand breaks of DNA occur with a frequency of 2.3 x 10^2 per hour, while repair occurs at 2.0 x 10^5 per hour.
Estimates put mutation rates per cell in bacteria and viruses at 10^-3 to 10^-4, and in humans at 10^-5 (i.e. 1:100,000 gametes). However, the determined value for humans is 4:100,000 gametes. This appears very low, but when we consider that a single male ejaculate contains about 300 million sperm, some 12,000 of them (300,000,000 x 4/100,000) are expected to be mutants.
2. Repair mechanisms
Besides enzyme-based repair mechanisms, DNA can be repaired by virtue of its double-stranded nature. If mutations are left uncorrected in both growing and non-growing somatic cells, the damage will be such that the cells are unable to survive. Thus, for survival, all such mutations must be corrected. DNA can be restored by the following mechanisms:
a. Natural DNA repair
Processes have evolved that take advantage of the double helical structure of DNA, using the redundancy of the two polynucleotide chains to raise enormously their stability as information carriers. The principle of error detection and correction operates very efficiently.
b. Proof reading
This is done by the polymerase enzymes.
Proof reading helps to nullify errors of DNA copying, because the polymerases demand accurate base pairing in the copying of a DNA strand. All prokaryotic polymerases (I, II, III) have a 3' to 5' exonuclease activity (but only I and III also have an inbuilt 5' to 3' exonuclease activity). This gives them the ability to excise any incorrectly placed base. The proof reading property of DNA polymerases therefore results in low mutation frequencies.
c. Repair of DNA damaged by ultraviolet (UV) light
UV damage mostly affects the DNA of lower organisms such as bacteria (e.g. E. coli). Several ways of restoring such damage exist: photoreactivation, excision, recombination, and systems of survival (SOS). When DNA absorbs UV light, neighbouring pyrimidine bases can become chemically rearranged and covalently linked into a pyrimidine dimer. Because they stop DNA replication, these UV damages can be lethal. But cells contain repair enzymes that cut out the damaged region, replace it with a new segment of polynucleotide using the intact strand as a template, and reseal the molecule with covalent bonds (detailed below).
Excision (or dark) repair
R. Setlow et al. demonstrated that absorption of UV causes cyclobutane rings (i.e. pyrimidine dimers between T-T, T-C and C-C). They also demonstrated that repair of such damage through excision can occur in cells kept in the dark. This kind of damage usually distorts the DNA backbone, and its genetic consequence is hindered replication or transcription. It has been shown that transcription is stopped at such points, resulting in a shortened species of mRNA. Excision repair is also called the cut, patch and seal repair process, and may be undertaken as follows:
Step 1: Cutting or incision. This is done by a specific endonuclease that recognises the distorted area and cuts the affected strand on either side of the damage.
Step 2: Excision.
This involves the 5' to 3' exonuclease activity of DNA polymerase I, which digests away abnormal or chemically modified bases.
Step 3: Patching. Complementary base pairing in the 5' to 3' direction by DNA polymerase I fills the gap created by excision.
Step 4: Sealing or bridging. This is done by DNA ligase and completes the process.
Photoreactivation
The effect of UV radiation is to link adjacent pyrimidine bases into dimers, for instance adjacent thymines through C5 and C6. Exposing the irradiated cells to visible (or white) light can restore such damaged DNA. Photoreactivation is light-dependent and involves the enzyme photolyase (or PRE), which cleaves such links and so restores the molecule to normal.
d. SOS
(i) An enzyme system involving DNA methyl transferase reverses the action of alkylating agents by transferring methyl or ethyl groups from a G or T to an internal cysteine residue of the enzyme;
(ii) Glycosylase enzymes also remove such alkylated bases, leaving apurinic sites that are then repaired by an apurinic endonuclease repair system; for instance, the enzyme uracil-DNA glycosylase recognises U residues in the DNA and excises them, leaving a gap.

Topic Summary
In this topic, we examined mutation both as a source of heritable variation and as a source of harmful effects on the individual. This variation was internal and external to the gene and was caused in several ways. The reprieve, however, was that these changes in the gene could be restored through the various repair mechanisms available to the cell. These repair mechanisms include: (1) repair by DNA polymerase proofreading; (2) photoreactivation; (3) excision repair.

Further reading
(1) Suzuki, D.T., Griffiths, A.J.F., Miller, J.H., and Lewontin, R.C. (1986). An Introduction to Genetic Analysis (3/e). W.H. Freeman & Co., NY.
(2) Russell, P.J. (1992). Genetics (3/e).
HarperCollins Publishers, NY.

Topic Activities
Use the questions to gauge your understanding of this topic.
1. How would you classify gene mutations?
2. Qualify these statements as either true or false:
(a) Mutations are caused by genetic recombination.
(b) Mutations occur more frequently if there is a need for them.
(c) Frameshifts are not gene mutations.
(d) Ultraviolet light usually causes deletion of DNA segments.
(e) Sense mutations lead to a longer polypeptide.
3. Mutagens induce gene mutations by at least three mechanisms. Which are these?
4. Demonstrate how a base analog may cause mutation.
5. Describe any one repair mechanism familiar to you.

TOPIC 8. RECOMBINANT DNA TECHNOLOGY (RDT)
Introduction
Welcome to topic 8 of this course. Now you will learn and appreciate the revolution that followed the understanding of the function of nucleic acids, particularly DNA, as the hereditary material. Scientists were challenged to extend their studies and research into the potential of developing techniques to cause DNA from unrelated organisms to recombine. The success of this approach has revolutionized agriculture, medicine, conservation, forensic science, and many other areas of life.

Topic Time
As this is a new and fairly recent topic, a slow presentation is necessary. Thus four lecture hours are projected for its coverage. The student needs more time for supplementary reading and for any assignments that may arise.

Topic Learning Requirements
At the conclusion of this topic, there is a strong need for students to be taken to a molecular research laboratory for a hands-on exercise on DNA work: DNA extraction, manipulation and related studies.

Learning Outcomes
After successful completion of this topic, you should be able to:
i. Define and explain recDNA technology (RDT)
ii. Explain the basic steps in recombinant DNA work
iii. Define both the promises and fears of this technology
iv. Name some products that have arisen through molecular work

Topic Content
8.1 Introduction
From about the third quarter of the 20th century, public interest in genetics was heightened by the development and application of particular molecular techniques collectively called recombinant DNA technology. Research in this area has led to the development of industries hinged on biotechnology, or genetic engineering. Benefits of applied genetic engineering have been realized and are growing, particularly in agriculture, medicine, forensic science, conservation, etc.
Definition
Recombinant DNA technology (or gene cloning) is one aspect of genetic engineering and refers to a collection of experimental techniques that enable a scientist to identify, isolate and propagate fragments of the genetic material (DNA) in pure form. By this approach a scientist is then able to modify a DNA fragment and/or transfer it from one organism to another. The term RDT refers to the creation of a new association between DNA molecules, or parts of DNA molecules, which are not found together naturally.
The idea of deliberately changing the genetic makeup of an organism is as old as genetics itself, e.g. in 1927 H. J. Muller showed that it was possible to induce mutations by X-rays. Since the 1970s genetic analyses have taken a biochemical approach, and DNA sequence analysis is now a routine procedure. RDT techniques are now being used to understand heredity, evolution, gene control and human diseases.
Populations of organisms are able to adapt to changes in the environment because the populations are heterogeneous, due to the acquisition of new genes or the alteration of existing genes.
These natural processes of genetic restructuring form the essence of RDT, with the important difference that in RDT the changes are made by humans in the laboratory, whereas the natural processes occur in the environment under the pressure of natural selection.
Justification
Early breakthroughs on DNA were made without actually being able to determine the nucleotide sequence of the DNA involved. These breakthroughs were on:
1) The genetic code, its nature;
2) The mRNA, its existence and role;
3) Translation, its mechanism; and
4) Gene regulation, its basic principles.
The breakthroughs were indirect because of:
1) The great size of the DNA molecule,
2) The chemical similarity of DNA molecules, and
3) Insufficient quantities of the DNA of any given protein-coding gene.
Therefore, ways had to be found of isolating identifiable fragments of DNA of manageable length, in sufficient quantity for detailed molecular analysis. This became possible in the 1970s with the identification in bacteria of enzymes called restriction endonucleases. These enzymes cut particular pieces from heterogeneous mixtures of DNA molecules. With these enzymes on hand, there was an explosion of scientific activity as scientists became manipulators instead of observers. They now had the means of reproducibly creating and combining fragments from different sources.

8.2 Legitimate vs illegitimate DNA combination
Legitimate = recombination of DNA free of human intervention and within species. DNA recombination was formerly achieved exclusively by meiosis in higher organisms, and by conjugation, transduction and transformation in organisms such as bacteria. Where DNA was introduced across species, its success depended on the existence of regions of genetic homology. When humans tried to take DNA across species, e.g. in interspecific and/or intergeneric hybridization (through cross pollination in plants and artificial insemination in animals), this was usually unsuccessful.
Perhaps these failures were due to lack of genetic homology.

Illegitimate = recombination of DNA molecules from different species with the help of restriction endonucleases. These enzymes therefore eliminated the need for regional genetic homology. All that was required was exposed unpaired bases (Fig. 1).

Illegitimate rDNA technology therefore involves:
1) Obtaining a DNA fragment of interest (insert DNA) and placing it into a vector (a biological carrier, e.g. plasmid, virus, etc.),
2) Introduction of such a molecule (now called an rDNA molecule or clone) into a compatible host cell (a process called transformation). The host cell, as it divides, amplifies this DNA fragment to several copies.

8.3 Promises of rDNA technology

This technology swiftly, dramatically and forever changed the evolutionary face of the earth. In scientific research, the last century can be remembered by at least two significant eras:
1) The Atomic age (1945), and
2) The age of molecular genetic engineering (1970).

Both have undeniably changed the image, enlarged the fears, and promised unparalleled benefits to humankind. For instance, the Atomic age promised endless, cheap, clean energy. So far this has not been fully realized, and instead there is more worry about a radioactively polluted planet.

Molecular genetic engineering (rDNA) promised:
1) Mass production of the then rare and expensive pharmaceuticals;
2) An inexpensive source of bioenergy;
3) Genetic therapy (the transfer of one or more normal or modified genes into an individual's body cells to correct a genetic defect or boost resistance to disease);
4) New vaccines;
5) Super plants and animals; and
6) Infants largely free of defects due to prenatal diagnosis of genetic diseases.

Have any of the above been achieved/realized? This topic will endeavor to answer.
8.4 Techniques of recombinant DNA Technology

The central technique in rDNA work is gene cloning (i.e. the production of many copies of an identical gene).

Stages of gene cloning
1) Splicing the DNA of interest to a cloning vector;
2) Introducing the rDNA into a host cell for amplification.

1. Producing rDNA: Splicing DNA

In order to produce rDNA via DNA splicing, six requirements must be satisfied:
1) A restriction enzyme must be present to cut the cloning vehicle (vector);
2) The foreign DNA (insert DNA) must contain restriction sites matching those of the cloning vehicle;
3) A DNA ligase enzyme is necessary to seal the foreign DNA into the cloning vehicle;
4) A means of inserting the recombinant cloning vehicle into a host cell for multiplication must be available;
5) The cloning vector must be able to replicate in a host cell;
6) A means of selecting those cells carrying the cloning vehicle and cloned DNA is necessary.

(1) Restriction Enzymes

Some scientists argue that the discovery, characterization, and use of restriction enzymes marked the origin of genetic engineering.

Discovery

Restriction enzymes occur only in micro-organisms, where they act as an immune system by destroying infecting (invading) DNA molecules. However, host-cell DNA is not cut by its own restriction enzymes because a methylase enzyme has modified its sites with a methyl group (-CH3). In 1970, three scientists (Werner Arber, Daniel Nathans, and Hamilton O. Smith) discovered them. For this achievement, they shared the 1978 Nobel Prize in Physiology or Medicine, the first award for work related to genetic engineering.

Types

There are three main types: type I, type II, and type III. However, due to its specificity, only type II is widely used.
The vast majority of type II enzymes recognize short, specific sequences that are 4, 5, 6 or 8 nucleotides (bps) in length and display two-fold symmetry (palindromes):
i) EcoRI (from E. coli) cleaves 5' G^AATTC 3'
ii) PstI (from Providencia stuartii) cleaves 5' CTGCA^G 3'
iii) TaqI (from Thermus aquaticus) cleaves 5' T^CGA 3'
iv) NotI cleaves 5' GC^GGCCGC 3'
Note: ^ = point of incision

These molecular scalpels provide a powerful tool for generating unique fragments from very large DNA molecules. They are sequence specific irrespective of the source of such DNA. They also facilitate the subsequent joining of DNA.

Number

Today thousands of these enzymes, with cleavage specificities varying in both the nucleotide sequence and the length of the recognition site, are commercially available as analytical reagents.

Nomenclature

A restriction enzyme is named according to the organism from which it was isolated. For instance, in the enzyme EcoRI, the first letter is from the genus of the bacterium (i.e. Escherichia). The next two letters are from the name of the species (i.e. coli). An additional letter indicates the type or strain, and a final numeral is appended to indicate the order in which the enzyme was discovered in that particular organism.

(2) Restriction sites and DNA cleavage

Restriction sites

A restriction site is a sequence of DNA that is recognized by a restriction enzyme. DNA molecules are extremely long and featureless. Thus, it is hard to isolate a particular fragment of DNA or determine its sequence. The restriction sites that occur by chance in all DNAs provide natural chemical guideposts along the DNA. Because most restriction sites are only a few bps long, they will occur in any DNA of sufficient length, whatever its origin.
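The two-fold symmetry and the fixed cutting points described above can be checked with a short Python sketch. This is an illustration only: the function names, the toy sequence and the way cut positions are stored are assumptions made for this example and are not part of the course text. Only the top strand is modelled, so the overhangs themselves are not shown.

```python
# Illustrative sketch; names and the toy sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequence and top-strand cut offset (bases from the 5' end).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "PstI": ("CTGCAG", 5),     # CTGCA^G
    "TaqI": ("TCGA", 1),       # T^CGA
    "NotI": ("GCGGCCGC", 2),   # GC^GGCCGC
}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site has two-fold symmetry if it equals its reverse complement."""
    return site == reverse_complement(site)


def digest(dna, enzyme):
    """Cut a linear top strand at every occurrence of the enzyme's site."""
    site, offset = ENZYMES[enzyme]
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for name, (site, _) in ENZYMES.items():
        print(name, site, "palindromic:", is_palindromic(site))
    toy = "AAGAATTCGGTCGATTGAATTCCC"  # made-up sequence with two EcoRI sites
    print(digest(toy, "EcoRI"))
```

Running `digest` on the same sequence with the same enzyme always gives the same fragments, which is the property that makes restriction fragments reproducible.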
A particular restriction enzyme will therefore always cut a given DNA at the same point(s), generating a set of specific DNA fragments (restriction fragments) that can be sized and separated by gel electrophoresis (typically on agarose). For each restriction enzyme, it is possible to calculate the average length of the fragments the enzyme will generate and then use that information to estimate the approximate number and distribution of recognition sites in the genome. The estimate depends on two simplifying assumptions:
(1) that each of the 4 bases occurs in equal proportions, such that a genome is composed of 25% A, 25% G, 25% C and 25% T;
(2) that the bases are randomly distributed in the DNA sequence.

NOTE: These assumptions are never precisely valid, but they enable us to determine the average distance between recognition sites of any length by the general formula 4^n (n = number of bases in the site), viz:
(a) The probability that a 4-base recognition site will be found at a given position in a genome is (1/4)^4, or 1/4 x 1/4 x 1/4 x 1/4 = 1/256.
(b) The probability that a 6-base recognition site will be found at a given position in a genome is (1/4)^6 = 1/4,096.

Based on this premise:
(a) An enzyme that recognizes a 4-base sequence (e.g. GTAC for RsaI) will cut on average once every 4^4, or every 256 bps, creating fragments averaging 256 bps in length.
(b) EcoRI, which recognizes the 6-base sequence GAATTC, will cut on average once every 4^6, or 4,096 bps (about 4.1 kb).
(c) NotI, which recognizes the 8-base sequence GCGGCCGC, will cut on average once every 4^8, or 65,536 bps (about 65.5 kb).

However, because the actual distances between restriction sites for any enzyme vary considerably, very few of the fragments produced by the three enzymes above will be precisely 256 bp, 4.1 kb, or 65.5 kb in length. The duration of contact between the restriction enzyme and the DNA determines fragment size.
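The averages worked out above follow directly from the 4^n rule, and a few lines of Python reproduce them. This is a minimal sketch under the same two simplifying assumptions (equal base proportions, random base order); the function names and the 1,000,000 bp genome size are hypothetical values chosen only for illustration.

```python
def average_fragment_length(site_length):
    """Average spacing in bp between recognition sites of n bases: 4^n."""
    return 4 ** site_length


def expected_site_count(genome_bp, site_length):
    """Expected number of recognition sites in a genome of genome_bp bases."""
    return genome_bp / average_fragment_length(site_length)


if __name__ == "__main__":
    genome_bp = 1_000_000  # hypothetical genome size, for illustration only
    for enzyme, n in [("RsaI", 4), ("EcoRI", 6), ("NotI", 8)]:
        print(
            enzyme,
            "average fragment:", average_fragment_length(n), "bp;",
            "expected sites:", round(expected_site_count(genome_bp, n), 1),
        )
```

For a linear DNA molecule the number of fragments in a complete digest is one more than the number of sites cut, while for a circular molecule it equals the number of sites.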
If DNA is exposed long enough to any restriction enzyme, the result is a complete digest, i.e. DNA cut at every one of the recognition sites it contains. A partial digest leads to larger fragments and is obtained by controlling the enzyme quantity or the duration of digestion.

Because the exact cut sites in the DNA made by each restriction enzyme are known, computers can be used to predict if and where a gene may be cut by particular enzymes, and specific enzymes chosen accordingly.

DNA cleavage

Restriction enzymes cleave DNA so as to produce a 3'-OH group on one end and a 5'-PO4 group on the other. In so doing:

(i) Some restriction enzymes, such as TaqI, cleave each strand at similar locations, on opposite sides of the axis of symmetry, generating fragments of DNA that carry protruding single-strand termini (i.e. "sticky" or cohesive ends).

EcoRI (E. coli)              BamHI (B. amyloliquefaciens)
5' G^A A T T C 3'            5' G^G A T C C 3'
3' C T T A A^G 5'            3' C C T A G^G 5'
(^ = cutting point)

(ii) Other restriction enzymes, such as HpaI and HaeIII, cleave in the middle of their recognition sequence to produce fragments that have blunt ends that do not hydrogen-bond.

HaeIII
5' --- G G C C --- 3'         5' --- G G 3'       5' C C --- 3'
3' --- C C G G --- 5'    =    3' --- C C 5'   +   3' G G --- 5'

Cloning vector (vehicle)

Definition: A vector is a DNA molecule which can replicate in a suitable host organism and into which a fragment of DNA may be introduced.

Characteristics: A vector must
(i) Be easily introduced into the host organism by transformation (plasmids) or transfection (phages);
(ii) Have an origin of replication. This is a sequence of DNA that allows the vector DNA to be replicated in the host cell;
(iii) Have at least one cleavage site for a restriction enzyme. This site, called the cloning site, is where the insert DNA is incorporated into the cloning vector;
(iv) Encode a gene whose product distinguishes transformed host cells from untransformed cells.
For example, many cloning vectors carry a gene that confers resistance to an antibiotic; cells transformed with such vectors will grow in media containing the antibiotic, whereas untransformed cells will be killed.

Examples

Three types of vectors are commonly used for cloning DNA in bacterial cells: plasmids, bacteriophages, and genetic elements called cosmids. Each differs in the way it must be manipulated to clone pieces of DNA and in the maximum amount of DNA that can be cloned using it.
(i) Bacterial plasmids are extrachromosomal genetic elements that replicate autonomously within bacterial cells, e.g. Ti plasmids in Agrobacterium tumefaciens;
(ii) Cosmids are constructed DNA carriers similar to naturally occurring plasmids. They can carry up to 49 kbps and are thus suitable for carrying large eukaryotic genes;
(iii) Viruses; examples are SV40 and lambda phages.

2. Host Systems

Recombinant DNA is introduced into a compatible host in a process called transformation. Host cells that pick up a DNA molecule are called transformants or transformed cells. A single transformant then undergoes many cycles of cell division to yield a colony of cells carrying the recombinant DNA molecule. This DNA can then be isolated, purified, and analyzed. Potentially, the cloned DNA may be transcribed, its mRNA translated, and the gene product isolated and studied. The common hosts are E. coli and S. cerevisiae. After amplification, hosts carrying the hybrid vector are selected, and the DNA therein isolated.

Topic Summary

In this topic you have been introduced to a more recent approach to DNA isolation, manipulation and analysis, and to the potential of this technology in biological, medical, agricultural and forensic research. This approach is 'recombinant DNA technology' (recDNA or RDT).
The key elements in this technology are the restriction enzymes, popularly called restriction endonucleases. The technology revolves around enzymes, vectors and host systems. Through them we are able to amplify an identified DNA piece that can then be used as needed. The emergence of the polymerase chain reaction (PCR) has reshaped this procedure. This you will cover at the next level.

Further Reading
(1) Russell, P.J. (1992). Genetics (3/e). HarperCollins Publishers, NY.
(2) Jones, M., and Jones, G. (1997). Advanced Biology. Cambridge Univ. Press, UK.

Topic Activities

Use these questions to test your understanding of this topic.
1. Define RDT and give a brief background to its origin and growth.
2. What were the promises and fears of this technology?
3. What do you consider are the ethical, moral and legal issues in this technology?
4. The restriction enzymes identified cut DNA into fragments of different sizes, generating various numbers of fragments. Suppose the DNA molecule is 3 billion bps and you are given 4-base, 6-base and 8-base cutters; how many fragments will each generate?
5. Outline the process of cloning a DNA fragment.

TOPIC 9: GENETICS OF MICRO-ORGANISMS (BACTERIA AND PHAGES)

Introduction to the Topic

This topic forms the second part of this course, that is, microbial genetics. In it we review the decision by researchers of genetic events to turn to microorganisms as the organisms of choice in understanding molecular heredity. Microorganisms, like macroorganisms, have nucleic acids, in particular DNA, as the blueprint of life. Scientists involved in molecular heredity noted that these organisms offered numerous advantages over large organisms in studying molecular hereditary events, hence the shift to using them as tools for molecular research.
Whereas most microorganisms reproduce asexually, thus limiting the variation which is fundamental in phenotypic comparisons, variation was nonetheless possible. Scientists like Benzer analysed the fine structure of a gene using a bacteriophage and its bacterial host, and so did many others in specific aspects of molecular heredity.

Topic Time

A total of 10 hours is allocated to this topic given both its importance and the aspects to be covered.

Learning Requirements

No specific tools or items are required. All you need to do is supplement these notes with relevant references (both online and hardcopy textbooks).

Learning Outcomes

After successful completion of this topic, you should be able to:
i. List the advantages and disadvantages of the use of microorganisms in molecular studies/research.
ii. List some of the main microorganisms used in molecular studies and the areas of study in which they have been used.
iii. Demonstrate how variation can be generated in bacteria, organisms that largely reproduce asexually.
iv. Explain the possible mechanisms of genetic exchange in bacteria and phages.
v. Illustrate how gene mapping can be done using the three mechanisms of genetic exchange in microorganisms.

Topic Content

9.1 Introduction

A very large part of the history of genetics and of current genetic analysis (particularly molecular genetics) is concerned with microbes and viruses. Microbes include bacteria, fungi, algae, etc. All share a common property with the eukaryotes: they have DNA, the material of heredity.

Microorganisms are a useful biological material. They are chosen because they provide good illustrations of important molecular events, e.g. the replication, recombination and rearrangement of DNA, the control of genes, etc. Those scientists who studied heredity beginning in the 1940s utilized mainly microorganisms for their genetic studies.
Until the 1930s, virtually all genetic research had been done with animals and plants. Multicellular, macroscopic organisms offer the convenience that the characters whose hereditary transmission is to be studied can generally be recognized at sight. However, plants and animals have the disadvantage that the number of individuals that can be examined in any one experiment is limited to a few hundreds or at most a few thousands. Also, their lifecycles are long. To overcome these limitations, the founders of molecular genetics turned to bacteria: microscopic organisms whose lifecycles last less than one hour, and of which billions of individuals can be raised overnight in a Petri dish of nutrient growth medium.

Although Anton van Leeuwenhoek discovered bacteria in the latter part of the 17th century, little else was done on bacteria for a further 200 years. Then microbiology took its upswing in the 19th century. This was in response to the spontaneous generation theory widely acclaimed at the time. In the late 19th century a vast ensemble of bacteria of the most diverse shapes, sizes, and functions was discovered. These discoveries were soon to alter the human condition, in that they led to the rationalization of the making of good wine, beer and cheese, and to the conquest of infectious diseases. Other than being man's little friends and enemies, bacteria were soon acknowledged as a major component in the balance of the biosphere.

Indeed, macroscopic life as we know it would not be long for this world if it were not for bacterial intervention in the constant recycling of nitrogen and carbon between the atmosphere and organic matter! The use of viruses and bacteria as genetic research tools has led to major breakthroughs in our understanding of genetic processes.

9.2 Advantages of micro-organisms in Genetic Studies

Verifying the traits or properties of genes can be complex, especially in higher organisms.
Microorganisms have been found less complex for the study of such traits for the following reasons:
(1) Simplicity of the growth medium and the rapid growth rate. Bacteria can grow on minimal medium, which often comprises: a carbon source, e.g. glucose or glycerol; a nitrogen source (as NH₄⁺ or an organic compound such as histidine); salts, e.g. Na⁺, K⁺, Ca²⁺, Mg²⁺, SO₄²⁻ and PO₄³⁻; and trace elements.
(2) Short life cycles: micro-organisms have extremely short reproductive lifecycles of between 20 minutes and 1 hour. This, together with their very simple growth requirements, makes them ideal research organisms.
(3) Ease with which genetically distinct populations of bacterial cells can be obtained, thereby ensuring pure cultures.
(4) Haploid nature, i.e. a microorganism has only a single copy of its hereditary information. Dominance effects hide none of its genes.
(5) Save for interaction, every gene which the individual possesses can be expressed in its phenotype. Thus their study is simple.
(6) Large populations, which are generated rapidly and are economical to keep. Such large populations allow for the evaluation of such rare events as mutations.

9.3 Disadvantages
(1) Scarcity of recombination events.
(2) Reproduction almost wholly asexual, thus generating little genetic variability.
(3) Lack of morphological variation at individual level. Variations can occur at clone (colony) level, affecting such traits (phenotypes) as size per unit time, shape of perimeter or growth habit, texture, color, etc.; response to nutrients; resistance to antibiotics and bacteriophages, etc.

9.4 Important microbial and phage strains

Among the variety of micro-organisms, several strains of each are studied, viz:

Bacteria
(1) Escherichia coli - this is the most widely studied species of bacteria. It is usually prototrophic (i.e. capable of growing on a defined minimal medium), but occasional auxotrophs (mutant strains) exist. E.
coli has roughly 4,300 genes, and its genome has been completely sequenced;
(2) Salmonella species, e.g. S. typhi, the cause of typhoid, and S. typhimurium, which is popular among molecular biologists because one of the most important processes of genetic transfer between bacterial cells happened to be discovered with it;
(3) Streptococcus pneumoniae and Bacillus subtilis; B. thuringiensis; B. anthracis;
(4) Agrobacterium tumefaciens - carries the Ti plasmid.

Phages

Work on phage genetics was initiated in the late 1930s by Max Delbruck and colleagues. In the 1940s, he, along with Salvador Luria and A. D. Hershey, discovered genetic recombination in phages. Thereafter, phages have been extensively used as tools for the study of gene structure and function. The much investigated phages are:
(1) Phage lambda (λ). It was studied extensively as a model of phage gene expression.
(2) The T-phages that infect E. coli, both T-even and T-odd. The T-even include T2 and T4. T4 has a genome of 166 kbps with 150 genes; T2 was used by A. Hershey and M. Chase to prove that DNA is the genetic material. The T-odd include T1, T3, T5, and T7.
(3) Phage M13 has single-stranded circular DNA with mol. Wt. 1

9.5 Bacterial Genetics

Bacteria affect every aspect of our daily lives. They cause disease, provide us with food and medicines, and dispose of our wastes. Bacteria are essential to the genetic engineer and are crucial research tools in biochemical genetics.
9.5.1 Bacterial phenotypes

Bacteria show a variety of phenotypes, revealed mainly as mutations of the wild type that affect:
a) Colony morphology;
b) Ability to utilise a given sugar or synthesise an essential growth substance;
c) Resistance to drugs, antibiotics and bacteriophages.

9.5.2 Genetic Recombination

Although bacteria and viruses are ideal subjects for biochemical analysis, they would not be useful for genetic study if they did not have sexual processes. Such exchange processes create genetic variation in the various species and strains of bacteria and can be used to map bacterial and viral chromosomes.

In the 1940s and the 1950s, the availability of genetically different bacterial strains led to the study of a number of means by which genes could be exchanged between bacteria. These mechanisms or processes are: conjugation, transduction, transformation and, to a lesser extent, sexduction.
a) Conjugation - the unidirectional transfer of genetic information from one cell to another through the formation of a cytoplasmic bridge.
b) Transduction - gene transfer from one bacterium to another mediated by a phage.
c) Transformation - direct uptake of free DNA by bacteria.
d) Sexduction - bacterial genes carried from one cell to another on a molecule of DNA which acts as a sex factor, called F. This sex factor can reside in the bacterial chromosome or it may exist as an autonomous extrachromosomal unit. Genetic elements with such dual capacities are called episomes.

These mechanisms, in conjunction with recombination processes, are largely responsible for the rapid evolution of bacteria. The exploitation by geneticists of all three main systems of gene transfer has been of crucial importance for the construction of new strains and the analysis and manipulation of a wide range of biological processes.

1. Transformation

Griffith discovered this mechanism of gene transfer in 1928 with Streptococcus pneumoniae. Thereafter, DNA was identified as the active principle in transformation (Avery et al., 1944; Hershey and Chase, 1952) (Fig. 1). This was thus the earliest known means of gene transfer between bacterial cells. The frequency of occurrence of transformation is about 25% or less.

Bacteria that are able to take up and incorporate free DNA into their genomes are said to be competent. Some bacterial strains are highly competent during one or more phases of growth under normal laboratory conditions, whereas others require special treatments to be rendered competent. In E. coli, transformation is only possible after inducing competence through Ca²⁺ treatment and heat shock.

The process of transformation generally follows these steps, with modifications:
1. Development of competence;
2. Binding of double-stranded (ds) DNA molecules to the bacterium;
3. Uptake of the ds DNA into the bacterium, followed by its degradation to a single strand (ss);
4. Coating of the ss molecules with a specific protein that protects the DNA from nucleases;
5. Integration, by recombination, of the transforming DNA strand into the recipient chromosome at the point at which it is homologous;
6. Replication of the integrated DNA segment and segregation of recipient and donor alleles in progeny bacteria to yield transformed clones with new phenotypes.

2. Conjugation

1) The discovery of bacterial gene transfer

Can bacteria exchange hereditary information? Joshua Lederberg, a 21-year-old graduate student supervised by Edward Tatum, provided the proof in 1946. They observed that two nutritional mutant strains of E. coli K-12 have the ability to complement each other by transferring genetic material during cellular contact.
Illustration:
Strain A could only grow on a medium supplemented with the amino acid methionine (met) and the vitamin biotin (bio) (genotype: met- bio- thr+ leu+ thi+);
Strain B could only grow on a medium supplemented with the amino acids threonine (thr) and leucine (leu), and the vitamin thiamine (thi) (genotype: met+ bio+ thr- leu- thi-).

Experimental set-up: Cultures of the two strains grown on supplemented minimal media were spread on plates of minimal medium as follows:
Plate 1 - strain A,
Plate 2 - strain B,
Plate 3 - a mixture of the two strains.
Observations were made for any colony formation.

Results:
Plate 1 - No colonies (i.e. no growth),
Plate 2 - No colonies (i.e. no growth),
Plate 3 - A few colonies were observed, and they arose at a frequency of about 1 in 10 million cells (i.e. 1 x 10^-7).

Interpretation: Because only prototrophic colonies, met+ bio+ thr+ leu+ thi+, are able to grow on minimal medium, the appearance of some colonies suggested that recombination of genes had occurred. The question then arose whether the prototrophic colonies originated from mutation, transformation or recombination by some form of sexual union. Mutation was ruled out as the source of the prototrophs because it should then have produced colonies on all plates.

Conclusion: The wild-type colonies were produced by an exchange of genetic material between the two strains, i.e. E. coli has a sexual phase that can bring together the chromosomes of two different cells. Crossing over could then place in one chromosome good copies of all its necessary genes.

2) The discovery of the fertility factor, F

In 1953, William Hayes, a British geneticist, and Jacob and Wollman, two French geneticists, obtained evidence that the genetic exchange in E. coli is unidirectional, not reciprocal. This was through the following study:

Materials

They used two nutritional mutant strains, A and B.
Strain A: met- thr+ leu+ thi+ (auxotrophic for met),
Strain B: met+ thr- leu- thi- (auxotrophic for thr, leu and thi).

Experimental procedure:
1. Strain A culture was treated with streptomycin (an antibiotic that does not kill but prevents cell division), then mixed with strain B and observed for growth;
2. Strain B culture was treated with streptomycin, then strain A was added to it and observed for growth;
3. The two strains were mixed together without antibiotic, then plated on minimal medium.

Results

There was growth in experiment 1, with a colony frequency similar to that in experiment 3, but none in experiment 2.

Explanation

The division of strain B, and not of strain A, was necessary for the production of colonies. The donor could still transfer genes even after exposure to the antibiotic, but the recipient could not divide to produce colonies if it had been exposed to the antibiotic.

Conclusion

Genetic transfer in bacterial cells is non-reciprocal; cells of A donate genetic material to cells of B, which can then divide to produce colonies containing genes from both strains. Therefore, A is the donor and B the recipient. They termed the process conjugation.

Bacterial cells carry, besides the main chromosome, one or more small DNA molecules in the cytoplasm called plasmids. Of the various kinds of plasmids, a few are involved in conjugation and are called conjugative plasmids. One such conjugative plasmid is the sex element, or fertility factor, or F factor. In a subsequent study, Hayes discovered this factor (i.e. F). The donor cells have it (and are designated F+) while the recipients lack it (and are F-). The F+ cells are potentially male since they can transfer their genes to the F- cells. This F factor can be autonomous or be integrated into the chromosome. In the latter case the cell is referred to as Hfr (high frequency of recombination) and can donate its chromosomal genes. In such crosses the F factor itself is seldom transferred.
Thus, unless the entire chromosome is transferred, the ex-conjugant (the recipient cell after conjugation is completed) of Hfr x F- is always F-. The newly introduced DNA strand recombines with the DNA of the recipient cell. Subsequent DNA replication and cell division give rise to a new recombinant cell that has characteristics derived from each of the parental cells (Fig. 2). The partially diploid cells formed after conjugation are called merozygotes. The relative frequency with which donor (male) markers are incorporated into the recombinant chromosome is a measure of how close they are to the entering end.

3. Transduction

Most phages contain DNA. During growth and reproduction, some phages incorporate host cell DNA into their own DNA. If the phage then infects another cell, the DNA of the first host cell may be transferred to the second. This is transduction, and it was demonstrated by J. Lederberg and N. Zinder, his postgraduate student, in 1951 (Fig. 3).

They used two auxotrophic strains of Salmonella typhimurium:
Strain 1 (LA 22): tyr- phe- trp- met+ his+,
Strain 2 (LA 2): tyr+ phe+ trp+ met- his-.

When they were separately plated on minimal medium, there was no growth. Then, using a Davis U-tube, they placed one of the two strains in each arm, separated by a filter that lets the medium but not the bacterial cells pass (diagram). After a time, cultures from each arm were removed and examined for growth on minimal medium.

Results

The culture of LA22 grew while that of LA2 did not. The transducing agent was the phage P22, which is temperate in LA22 but lytic to LA2. When such a phage attaches to a new host cell, the fragment of bacterial chromosome it carries is injected into the cell. The fragment can then engage in crossing over with the host chromosome, ultimately leading to the production of a genetically altered bacterium. For example, some P1 particles grown on E. coli able to use lactose (Lac+) carry lactose genes. Addition of such phages to an E.
coli unable to use lactose (Lac-) converts some Lac- bacteria to the Lac+ form by means of genetic recombination. However, it is generally a very rare process.

Transducing phages are categorised into special (restricted) and general types. The special types pick up specific regions of the host bacterial chromosome, e.g. phage λ in E. coli, which picks up a region that includes a cluster of genes controlling galactose fermentation. General transducing phages acquire DNA randomly from the bacterial chromosome, e.g. P22 in Salmonella.

9.5.3 Mapping the bacterial chromosome

Bacterial gene mapping is done somewhat differently from that of higher organisms. Methods include interrupted conjugation, uninterrupted conjugation, complementation mapping, and mapping by deletion mutants.

1. Interrupted conjugation

Interrupted mating is a technique used to map bacterial genes by determining the sequence in which donor genes enter recipient cells. Conjugation is disrupted after specified time intervals. It was pioneered by François Jacob and Élie Wollman (1952). They used the Hfr strain, a strain in which the F factor is integrated in the chromosome. This strain therefore acted as the donor. They noted that when the Hfr strain conjugated, the genes closely behind the leading part of the F plasmid were transferred efficiently, but the trailing genes made it across less frequently since the conjugation tube was more likely to break before they made it through. They devised a method to control this.

The strains used were: Hfr: str^s a+ b+ c+ d+, and F-: str^r a- b- c- d-. These were put together for some time. At specific time intervals (e.g. 5, 10 minutes), samples were removed. Each such sample was then put into a Waring blender for a few seconds to disrupt the mating cell pairs and then plated on minimal medium containing streptomycin. On such medium all unmated Hfr cells were killed, and unmated F- cells were unable to grow.
This treatment therefore allowed only F- cells with a recombinant genotype to form colonies. The str^r cells were then tested on selective media for the presence of marker alleles from the donor. Transfer of each donor allele obviously depends on the time that conjugation is allowed to continue.

Results

i) They found that when the genetic transfer was interrupted after a few minutes, some of the genetic markers would make it across, while others always remained behind.
ii) As the periods of mating were extended, more and more markers were transferred and could be recovered in the progeny.
iii) The markers that appeared with greater frequency were nearer the leading end of the length of DNA. The frequency of appearance fell off for those that were more distant. In this way, the gene sequence could be mapped.

This experiment also showed that bacterial genes occur in a definite sequence, the sequence in which they are transferred into the F- cell. This information was useful in the construction of a gene map, using as a measure of "distance" the time at which the donor alleles first appear after mating. The units of distance in this case are minutes. Thus if marker b+ begins to enter the F- cell 10 minutes after a+ has entered, then the a+ and b+ markers are 10 minutes (units) apart on the genetic map.

Their study involved the following cross:

Donor: Hfr H str^s thr+ leu+ azi^r ton^r lac+ gal+
Recipient: F- str^r thr- leu- azi^s ton^s lac- gal-

The Hfr H strain is prototrophic and is sensitive to the antibiotic streptomycin. The F- (strain) cell carries a streptomycin resistance gene and also a number of mutant genes.
Thus it was auxotrophic for threonine (thr) and leucine (leu), sensitive to sodium azide (azi^s) and to infection by the phage T1 (ton^s), and unable to ferment lactose (lac) or galactose (gal) as a carbon source. Following the above procedure, they produced the gene map below:

Minutes   0        5     10     15     20     25
Marker    Origin         azi    ton    lac    gal

The genetic map obtained through analysis of interrupted matings is the same as that arrived at by analysis of the frequencies of the various recombinant classes.

Note: Interrupted mating is useful for physically mapping genes separated by large distances, but it is not at all useful for mapping markers separated by only 1 or 2 minutes on the chromosome. These can be ordered and mapped by recombinational analysis.

Example

An Hfr strain carrying the prototrophic markers a+ b+ c+ is mixed with an F- strain carrying the auxotrophic alleles a, b, c. Conjugation was interrupted at 5-minute intervals and the cells were plated on media which revealed the presence of recombinants.

Time (minutes)    Recombinants
5                 a  b+ c
10                a  b+ c+
15                a+ b+ c+

Determine the gene order.

Solution: The order of the genes in the Hfr donor strain is b+ c+ a+. Gene b is less than 5 time units from the origin; c is less than 10 time units from b; and a is less than 10 time units from c.

9.6 Phage Genetics

9.6.1 Introduction

Depending upon the host they infect, viruses can be broadly placed in three categories: (i) animal viruses, (ii) plant viruses and (iii) bacterial viruses or bacteriophages. The simplest of these three classes are the bacteriophages. Bacteriophages are also the class of viruses that has been most extensively used for the study of the genetics of viruses. Contemporary with the discovery of sexuality in bacteria in 1946 by Lederberg and Tatum, genetic recombination was also demonstrated between different strains of bacteriophages by M. Delbruck, W.T. Bailey and A.D. Hershey.
These workers were also able to detect and artificially induce mutations in these organisms. Considerable genetic work has since been undertaken by these and other workers using bacteriophages. Drs. A.D. Hershey, M. Delbruck and S.E. Luria shared the 1969 Nobel Prize in medicine for their contribution to the understanding of replication and recombination in viruses.

9.6.2 Reasons for suitability of phages as genetics research tools

(1) They have a very rapid rate of multiplication. After infecting a host cell, a phage takes about 20 minutes to lyse it, releasing hundreds of phages into the environment.

(2) For the reason above, phages can be used to observe and score rare genetic events, such as mutation and recombination. For example, if a cell is doubly infected with two genetically different strains of a particular phage, then the two parental types, together with both types of recombinant, can be recovered after only 20 minutes. By plating these progeny phages on a suitable host strain of bacteria, the parental and recombinant types may be recognised after 24 hours or less. A similar experiment with Drosophila would take at least 3 to 4 weeks, and over a year with peas!

(3) Small size and simplicity of genetic organization. The average phage genome is very small. For example, the genome of phage φX174 has only 9 genes whilst that of lambda (λ) has fewer than 60 genes. This means that it is possible to identify practically all the genes within a viral genome. Therefore, one can investigate and understand the organization and regulation of an entire genome.

Aside from their utility in the study of gene expression, phage genetics has been put to practical use as well. Cloning of the human insulin gene in bacteria was accomplished using a bacteriophage as a vector. The phage delivered to the bacterium a recombinant plasmid containing the insulin gene.
Viruses consist of a genome (single stranded or double stranded RNA or DNA) enclosed in a capsid. They replicate using the metabolic machinery of their specific host cell. Phages may be virulent and follow the lytic cycle, or temperate with a lysogenic replication cycle.

9.6.3 Phage Crosses

Viruses can also transfer some of their genetic material to one another, a kind of sexual reproduction. That recombination can occur in some phages was first demonstrated in 1946 by Max Delbruck and Bailey. They used two phage types: T2r+ and T4r. The phage T2r+ forms small plaques due to slow lysis of the host; the rapid-lysis mutant T4r forms large, circular plaques due to rapid lysis of the host bacterium. When E. coli strain B was simultaneously infected with both types and the lysate examined, progeny of types T2r and T4r+ were found. These are new types (recombinants). Their frequency ruled out back mutation. A transfer of genes controlling plaque type had taken place between these strains.

Since recombination is possible even in viruses, their genomes can as well be mapped.

1. Recombination in phages

After Delbruck and Bailey's studies, Seymour Benzer also studied genetic exchange in viruses and reported his findings in 1953. He worked with the bacterium E. coli and the phage T4, and its class of mutants that produce a phenotype called rapid lysis, r. There are 3 genes, I, II and III, in the phage T4 genome that can mutate to rapid-lysis forms symbolised as rI, rII and rIII. Benzer chose to work with the rII locus basically because all rII alleles are conditional lethals, i.e. an rII mutant will grow and form large plaques on E. coli strain B but will not grow at all on strain K12. In contrast, rII+ (normal or wild-type) phages will grow and form small, ragged plaques on both strains (below). This conditional growth was to prove essential for identifying recombinants.

Plaque phenotypes produced by different combinations of E.
coli and phage strains

T4 phage strain    E. coli str. B    E. coli str. K12
rII                Large, round      No plaque
rII+               Small, ragged     Small, ragged

Benzer started with a sample of 8 independently derived rII mutant strains and set about crossing them in all possible combinations by double infection (mixed infection) of the B strain, as below:

rII-- allele  x  rII- allele
        |
double infection of E. coli str. B
        |
plate lysate on E. coli K12

Results:
(i) rII- won't grow, i.e. is parental
(ii) rII-- won't grow, i.e. is parental
(iii) rII+ will grow, i.e. forms plaques*

* The formation of plaques indicates that recombination has occurred within a gene and not just between genes. The frequency of plaques was too large for them to be due to back mutations.

Furthermore, the alleles could be mapped unambiguously to the right or left of each other to give a gene map as below, with the map units being based on the frequency of rII+ plaques, i.e.

Map distance = (2 x no. of rII+ progeny / total no. of progeny) x 100

The result was the intragenic map below:

[Figure: intragenic map of the rII region showing the eight mutant sites, read from the original figure in the order rII1, rII7, rII4, rII5, rII2, rII8, rII3, rII6]

Mixed infection therefore provides a technique for testing functional and structural allelism between different mutants of a phage in pairwise combinations. If there was no growth, this suggested functional allelism; if there was growth, it indicated complementation in the trans arrangement.

Benzer then isolated more rII mutants and carried out double infection on E. coli K12. He was able to categorise these mutants into complementation groups, which he arbitrarily called A and B. Any member of group A will complement any member of group B, but no complementation is possible between any pair of alleles in the same group. Mutations that fail to complement must be affecting the same unit of function.

2. Mapping

This requires consideration of deletions (mutants) and therefore an understanding of the technique of complementation or helping.
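Before complementation is taken up, the map-distance formula given above for rII crosses can be illustrated with a short calculation. The following is a minimal sketch only: the function name `rii_map_distance` and the plaque counts are hypothetical and are not data from Benzer's experiments. It assumes the rII+ recombinants are counted on strain K12 and the total progeny on strain B. The factor of 2 is usually explained by the fact that the reciprocal double-mutant recombinants cannot grow on K12 and so go uncounted.

```python
def rii_map_distance(wild_type_plaques: float, total_progeny: float) -> float:
    """Map distance (%) = 2 x (no. of rII+ progeny) / (total no. of progeny) x 100.

    wild_type_plaques: rII+ recombinants (assumed counted on E. coli K12)
    total_progeny:     all progeny phage (assumed counted on E. coli B)
    """
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if wild_type_plaques < 0 or wild_type_plaques > total_progeny:
        raise ValueError("wild_type_plaques must lie between 0 and total_progeny")
    # Doubling accounts for the undetected double-mutant recombinant class.
    return 2 * wild_type_plaques / total_progeny * 100


# Hypothetical counts, for illustration only.
distance = rii_map_distance(wild_type_plaques=25, total_progeny=10_000)
print(f"{distance:.2f} map units")
```

With these illustrative counts, 25 rII+ plaques among 10,000 progeny correspond to 0.5 map units between the two mutant sites.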
(a) Complementation

Two mutants complement each other if each one can supply a function the other lacks. A complementation test therefore defines genes in any organism or virus. Recessive mutations with similar phenotypic effects can be tested for functional allelism by determining the phenotypes of the cis and trans heterozygotes (Fig. below).

[Figure: cis test and trans test, each shown for mutant sites in the same cistron and for mutant sites in different cistrons. x = mutant sites; A, B = parts of the rII locus (cistrons)]

A cistron is a genetic region within which there is no complementation between mutations. If two different rII mutants of the phage T4 infect the same bacterium (trans configuration) and no growth of phage occurs, the mutants are said to be in the same cistron. But if complementation occurs (i.e. progeny phage grow), the mutants are in different functional units (lower right box above). In the upper right box, only the B function can be expressed normally.

If the trans heterozygote shows a mutant phenotype, the mutations are taken to be alleles of the same gene. Cis heterozygotes always show a wild-type phenotype. Thus mutations are allelic if the cis and trans arrangements result in different phenotypes. They are nonallelic if both arrangements result in the same (wild-type) phenotype. When two mutations in the trans position restore the wild-type phenotype, they are said to complement each other.

Using complementation data, complementation maps can be drawn in which the extent of non-complementation is represented by a straight line and complementation by the absence of an overlap between these lines. Benzer confirmed complementation among various rII mutants of T4 in this way.

Worked out Examples

(1) Two independent mutations of the T4 have been isolated.
Both lack the ability to form plaques in bacterial cultures. In an attempt to determine whether these mutations were allelic or simply in different structural genes responsible for similar phenotypes, mixed infections of bacterial cells were carried out. The cultures were examined for plaques, and plaques were observed. Are these two mutations alleles of the same locus?

Answer

Since neither mutation by itself can produce plaques in a bacterial culture, their phenotypes are indeed similar. Simultaneous infection leads to a diploid state of the phage chromosome within the same cell. If the mutations are alleles of one another at the same locus, neither chromosome will have the ability to produce plaques. If the mutations are not alleles, but each represents a mutation of a different structural gene (loci A and B), then in the diploid state there will be a wild-type allele present for both locus A and locus B, and the two chromosomes complement one another to produce plaques. The presence of plaques shows that these two independent mutations are not alleles of one another.

(2) Seven new mutations were identified in the bacterium E. coli during some sexual conjugation tests for recombination. Each of the mutations prevented normal metabolism of galactose, and they were identified as galA, galB, galC, galC1, galC2, galD and galD1. When partial diploids in all pairwise combinations were examined by complementation tests, the following results were obtained (+ = normal complementation; 0 = no complementation).

Mutant  A   B   C   C1  C2  D   D1
A       0   +   0   0   +   0   +
B           0   +   +   0   +   +
C               0   0   +   0   +
C1                  0   +   0   +
C2                      0   +   +
D                           0   +
D1                              0

Questions:
a) How many loci are there?
b) Which mutations are alleles of one another?

Solution

i. Mutant A complements B, C2 and D1 but not C, C1 and D. Thus A, C, C1 and D belong together, leaving B, C2 and D1.
ii. Mutant B complements A, C, C1, D and D1 but not C2. Thus the groups are: A, C, C1, D; B, C2; D1.
iii.
Mutant C complements B, C2 and D1 but not A, C1 and D. Thus the loci are as above.
iv. Mutant C1 complements B, C2 and D1 but not A, C and D. Thus the loci are as above.
v. Mutant D1 complements all the other mutants. Thus the loci are as above.

Therefore:
a) There are 3 loci.
b) A, C, C1 and D are alleles of one locus; B and C2 are alleles of a second locus; and D1 is the only mutation at the third locus.

(3) Six mutations are known to belong to three cistrons. From the results of the complementation tests, determine which mutants are in the same cistron.

      1   2   3   4   5   6
1     0   +   +
2         0       +   +
3             0   +   0
4                 0       +
5                     0   +
6                         0

Key: + = complementation; 0 = non-complementation; blank = not tested

Solution

i. Obviously mutations 3 and 5 are in the same cistron, since they fail to complement each other.
ii. Mutations 1 and 3 are in different cistrons, since they complement each other. They are thus assigned as below to two cistrons, A and B.

Cistron A    Cistron B    Cistron C
3, 5         1

iii. 1 and 2 are in different cistrons, but we do not know whether 2 is in A or C. However, 5 and 2 complement, and therefore 2 cannot be in either cistron A or B and thus must be in C.

Cistron A    Cistron B    Cistron C
3, 5         1            2

iv. 3 and 4 complement, thus 4 must be in either B or C. But 2 and 4 also complement, thus 4 cannot be in C and must reside in B.

Cistron A    Cistron B    Cistron C
3, 5         1, 4         2

v. Mutant 6 cannot be in A since it complements 5. Thus 6 is either in B or C. Since 6 and 4 complement, they are in different cistrons. If 6 cannot be in A or B, it must be in C. The mutants are thus grouped:

Cistron A    Cistron B    Cistron C
3, 5         1, 4         2, 6

3. Six point mutants are known to reside in 3 cistrons.
Complete the following table (+ = complementation; 0 = non-complementation)

1 1 2 0 + 2 0 3 3 4 5 6 + + + + 0 4 + 0 0 5 0 6 + 0

Answer:

Cistron A Cistron B 1 Cistron C 2 3 4 5 6 + + + + + 0 + + 0 + 0 0 + 0 0 + 0 5 0 + 6 0 1 2 3 4 0

Deletion mapping

Besides using the properties of mutational sites for intragenic mapping, Benzer introduced deletion mapping to enhance understanding of the finer structure of the gene. Deletions are mutations that result from the elimination of segments of DNA. Such mutations cannot be converted into wild type by recombination, because the DNA corresponding to the wild-type region for that particular mutation is no longer present.

To understand deletions, take an analogy with recording tapes. Consider 3 tapes, each with a blemish or deletion in a different place, labelled A, B and C. Imagine these deletions are so located that deletion B overlaps deletions A and C but A and C do not overlap each other. In this case only recombining A and C can recreate a good performance.

A deletion is shown as a bar because it represents the loss of several sites. If the nucleotides of a gene are a, b, c, etc., then each point mutation is a change in one nucleotide, but a deletion is the loss of a whole series of nucleotides. For example, if a b c d e f g h is the standard sequence, a change in any single letter represents a point mutation, whereas a b c f g h (loss of d and e) represents a deletion.

A number of rII mutants are deletion mutants, which are missing small segments of the genome. Point mutants can back-mutate to wild type, but deletion mutants never do because it would be nearly impossible to add back the right sequence of nucleotides by accident. Where comparable deletions are contained in mutants, recombination experiments do not yield wild types. Mutants that do not revert to the standard type when they reproduce contain the same or overlapping deletions.

To test thousands of mutants through cross infection would be cumbersome.
The way out would be to use mutants of the non-reverting type, whose deletions divide up the rII region into segments. Each point mutation is tested against these reference deletions. The recombination test gives a negative result if the deletion overlaps the point mutation and a positive result if it does not. In this way a mutation is quickly located within a particular segment of the map.

a. Test mutants

[Figure: the rII region, with the deleted segment carried by each of the test mutants A, B, C and D]

b. Point mutation testing, e.g. using mutant X

Test mutant A x Mutant X: No standard recombinants
Test mutant B x Mutant X: No standard recombinants
Test mutant C x Mutant X: Standard recombinants possible
Test mutant D x Mutant X: Standard recombinants possible

Benzer realized that deletions could be used for the rapid location of mutational sites in newly obtained mutants.

Illustration

Consider the following gene map showing 12 identifiable mutational sites:

1  2  3  4  5  6  7  8  9  10 11 12
I  I  I  I  I  I  I  I  I  I  I  I

i. One special mutant, D1, fails to give rII+ recombinants when crossed with mutants carrying altered sites 1, 2, 3, 4, 5, 6, 7 or 8. Therefore, D1 behaves as if it carries a deletion of sites 1 to 8, leaving:

9  10 11 12
I  I  I  I

ii. Another special mutant, D2, fails to give rII+ recombinants when crossed with mutants carrying altered sites 5, 6, 7, 8, 9, 10, 11 or 12. Therefore D2 behaves as if it involves a deletion of sites 5 to 12, leaving:

1  2  3  4
I  I  I  I

These overlapping deletions now define 3 areas of the gene, designated i, ii and iii.
Area        i         ii        iii
Sites    1 2 3 4   5 6 7 8   9 10 11 12
D1       ===================------------
D2       ----------=====================

(= deleted; - present)

Thus,
(i) A new mutant that gives rII+ recombinants when crossed with D1 but not when crossed with D2 must have its mutational site in area iii;
(ii) One that gives rII+ recombinants when crossed with D2 but not with D1 must have its mutational site in area i; and
(iii) A mutant that does not give rII+ recombinants with either D1 or D2 must have its mutational site in area ii.

Deletions themselves can be intercrossed and mapped just like point mutations. A bar represents the deleted region. If no wild-type recombinants are produced in a cross between different deletions, then the bars are shown as overlapping.

Worked out Examples

(1) The following deletion map shows four deletions (1-4) involving the rIIA cistron of T4:

1 -------------
2 -----------------
3 ----------------------
4 --------------------------

Five point mutations (a-e) are tested against these 4 deletion mutants for their ability to give wild-type (r+) recombinants; the results are:

     a   b   c   d   e
1    +   +   +   +   +
2    +   +   +   -   -
3    +   -   +   -   -
4    -   -   +   -   -

What is the order of the point mutations?

Answer

The key principle here is that point mutations can recombine with deletions that do not extend past the mutation, but cannot recombine to yield wild-type phages with deletions that do extend past the mutation. Looking at the test results given in the problem, any mutation that recombines with deletion 1 must be to the right of deletion 1, any mutation that recombines with deletion 2 must be to the right of deletion 2, etc. Therefore,

(i) Consider point mutation “a”. It recombines with deletions 1, 2 and 3 but not with deletion 4. Therefore it is to the right of deletions 1, 2 and 3 but not to the right of deletion 4. We can then easily place point mutation “a” in the interval between deletions 3 and 4.
(ii) Point mutation “b” recombines with deletions 1 and 2 and therefore must be to the right of them. It does not recombine with deletions 3 and 4, so it is in the interval between deletions 2 and 3.
(iii) Point mutation “c” recombines with all the deletions and thus is to the right of all deletions.
(iv) Finally, both point mutations “d” and “e” recombine only with deletion 1 and must therefore be in the interval between deletions 1 and 2.

The results therefore are as follows:

1. ---------------  e, d
2. ----------------------  b
3. ----------------------------  a
4. -----------------------------------  c

(2) In a phage, a set of deletions is intercrossed in pairwise combinations. The following results are obtained (+ = wild-type recombinants):

     1   2   3   4   5
1    -   +   -   +   -
2    +   -   +   +   -
3    -   +   -   -   -
4    +   +   -   -   +
5    -   -   -   +   -

Using this table, construct a deletion map.

Solution

If a "+" implies non-overlap and a "-" an overlap, then:
i. 1 overlaps 3 and 5;
ii. 2 overlaps 5 only;
iii. 3 overlaps all but 2;
iv. 4 overlaps 3 only; and
v. 5 overlaps all but 4.

Results

        I              II             III
4 -----------    1 -----------    2 -----------
3 ----------------------------
                 5 ----------------------------

(3) Seven deletion mutants within the A cistron of the rII region of phage T4 were tested in all pairwise combinations for wild-type recombinants. In the table of results below, + = recombination; 0 = no recombination. Construct a topological map for these deletions.

     1   2   3   4   5   6   7
1    0   +   0   0   +   0   0
2        0   0   0   +   +   0
3            0   0   +   +   0
4                0   +   0   0
5                    0   0   0
6                        0   0
7                            0

Solution

Note: if any two deletions overlap to any extent, no wild-type recombinants are formed (0). Thus,

(i) Deletion 1 overlaps with 3, 4, 6, 7 but not with 2 or 5. Thus:

1 ----------          2, 5 ------------
3, 4, 6, 7 -----------

(ii) Deletion 2 overlaps with 3, 4 and 7 but not with 1, 5 or 6.
1 ----------      2 ----------      5 ---------
       3, 4, 7 ----------------
6 ------

(iii) Deletion 3 overlaps with 1, 2, 4 and 7 but not with 5 or 6.

(iv) Deletion 4 overlaps with 1, 2, 3, 6 and 7 but not with 5.

(v) Deletion 5 overlaps with 6 and 7 but not with 1, 2, 3 or 4.

Combining these results gives a topological map consistent with all the data (left-right orientation is arbitrary):

5 ----------
        6 ------------
                  1 --------------
                          3 ------------------
                                    2 ----------
              4 --------------------------------
7 ----------------------------------------------

(4) Six deletion mutants in rIIA of T4 were tested in pairwise combinations for wild-type recombinants. The results are as follows:

     1   2   3   4   5   6
1    0   0   0   0   0   0
2        0   0   0   0   +
3            0   +   0   0
4                0   +   +
5                    0   +
6                        0

Key: + = recombination; 0 = no recombination

Question: Construct a topological map for these deletions.

Solution

1 ------------------------------------
2 ------------------------
3             ------------------------
4 ----------  5 ----------  6 --------

(5) The following map shows 4 deletions (1-4) involving the rIIA cistron of T4, with the positions of the point mutations in bold letters:

Sites     c     e     d     a     b
1       -----
2       -----------------
3                   -----------------
4                               -----

Five point mutations (a-e) in rIIA are tested against these 4 deletion mutants for their ability to give r+ recombinants, with the following results:

     a   b   c   d   e
1    +   +   -   +   +
2    +   +   -   -   -
3    -   -   +   -   +
4    +   -   +   +   +

What is the order of the point mutants? (See above, bold letters; i.e. the order is c e d a b.)

(6) The following 5 histidine-requiring mutations of Neurospora were studied and their allelic relationships determined thus:

a. No complementation occurs between CD-16 and the other mutations.
b.
Mutants 245 and 261 complement each other and also complement D-556 and 1438, but not CD-16.
c. Mutants D-556 and 1438 do not complement each other but complement 245 and 261.

Complementation matrix:

         CD-16   245   261   D-556   1438
CD-16    0       0     0     0       0
245              0     +     +       +
261                    0     +       +
D-556                        0       0
1438                                 0

Key: 0 = no complementation; + = complementation

Construct a gene map of this histidine locus.

Answer

From the various complementation tests, 3 cistrons can be categorised. The order and relative positions of the various sections can be precisely determined by isolating mutations which overlap two adjacent sections.

(i) Complementation map

             I              II             III
CD-16   =========================================
245     ============
261                    ===============
D-556                                   =========
1438                                    =========

Genetic map

CD-16   245   261   D-556   1438

4. Abortive transductants are relatively stable merozygotes that can be used for complementation tests. Six mutants were tested in all pairwise combinations, yielding the results shown below (+ = complementation, 0 = non-complementation). Construct a complementation map consistent with the data.

     1   2   3   4   5   6
1    0   +   0   +   +   +
2        0   0   +   +   +
3            0   +   +   +
4                0   0   +
5                    0   0
6                        0

Answer:

1 ______   2 _______      4 ______   6 ________
3 ________________        5 __________________

Topic Summary

In this topic you have learnt the fundamental aspects of bacteria and viruses that make them essential tools in molecular genetics, especially in our understanding of the structure and function of genes. Although bacteria reproduce largely asexually, it was shown that genetic exchange can occur in them through conjugation, transduction and transformation. It was also shown that viruses too can exchange genetic material, leading to the creation of recombinants. Such was responsible for genetic diversity in the two.
Using these genetic systems, Benzer was able to map the details of the gene, giving rise to studies of the fine structure of the gene. From the recombinants generated in both bacteria and viruses, gene mapping in prokaryotes was demonstrated.

Further Reading

(1) Russell, P.J. (1992). Genetics (3/e). HarperCollins Publishers, NY
(2) Suzuki, D.T., Griffiths, A.J.F., Miller, J.H., Lewontin, R.C. (1986). An Introduction to Genetic Analysis (3/e). W.H. Freeman & Co., NY

Topic Activities

1. What reasons make viruses and bacteria suitable tools for molecular work?
2. What are the various modes of genetic exchange available to a bacterium?
3. Explain how it was shown that genetic exchange occurs in bacteria and viruses.
4. How was genetic exchange in bacteria shown to be unidirectional?
5. Benzer infected strain K of E. coli with rII mutants two at a time and plaque-assayed the lysates obtained. The following results were obtained.

       47    51    101   102   104   106
47     0     +     0     +     0     0
51           0     +     0     +     +
101                0     +     0     0
102                      0     +     +
104                            0     0
106                                  0

Question: Does the rII region consist of one, two or more cistrons? Explain.