Microbial Genome Organisation

Lecture 6 - DNA re-arrangements and gene expression

Prof. Duncan Shaw

The material for this lecture is from these sources:

"Genes and Genomes" by M. Singer and P. Berg, University Science Books 1991, Chapter 10
Weiser JN et al (1989), "The molecular mechanism of phase variation of H. influenzae lipopolysaccharide"; Cell 59, 657-665
Hammerschmidt S et al (1996), "Capsule phase variation in Neisseria meningitidis"; Molec. Microbiol. 20, 1211-1220
Borst P & Greaves DR (1987), "Programmed gene rearrangements altering gene expression"; Science 235, 658-667

Some further reading: a review of the adaptive mutation hypothesis:

Rosenberg SM (1994), "In pursuit of a molecular mechanism for adaptive mutation"; Genome 37, 893-899

Introduction

You have already had lectures on genome mapping of various kinds, and next term those of you who take the module "Chromosome Organisation and Development" will hear more about the organisation of genomes. Mostly, these lectures will have treated the genome as an entity whose structure is stable. But there are several important ways, besides mutation, in which the organisation of a genome can change. Some classes of DNA re-arrangement result in changes of gene expression and hence of the phenotype of the organism. It's this type of change that will be covered in this lecture.

DNA re-arrangement - programmed and unprogrammed

There is a distinction to be made between programmed and unprogrammed DNA re-arrangements. Programmed re-arrangements appear to have a "purpose" - they are involved in the regulation of gene expression, maybe in response to an external stimulus.
Examples include immunoglobulin and T-cell receptor genes, phase variation in bacteria such as Salmonella, Neisseria and Haemophilus, and yeast mating-type switching.

Unprogrammed re-arrangements could be viewed as a class of mutation. They include duplication and transposition of repeated sequences, transposons and viral genomes, and translocations between chromosomes in eukaryotes. If such a re-arrangement affects gene expression, the effect is random and non-specific (and will usually result in the loss of the gene's function).

Just to confuse things, I should say that this distinction is arbitrary and arises from scientists' need to classify and assign a purpose to events. The same molecular mechanism may underlie both programmed and unprogrammed DNA re-arrangements, but the outcomes can be quite different.

Yeast mating-type switching

Many yeasts, including Saccharomyces cerevisiae (the example used here), can exist in haploid or diploid forms. Diploids are heterozygous for the mating-type locus, MAT, and haploid cells can be either MATa or MATalpha.

This picture shows the life cycle of the yeast. The red and blue cells are a and alpha types. The difference between homothallic and heterothallic strains is that the former have an active HO gene and can switch spontaneously between mating types, so a single spore can give rise to a self-fertile population, whereas the latter do not have an active HO gene and maintain the same mating type during the haploid growth cycle.

Genetic analysis (i.e. the use of mutants) shows that the following genes are required for mating-type switching to occur:

- MAT
- HO, which codes for an endonuclease
- HMLalpha (for the MATa -> MATalpha switch)
- HMRa (for the MATalpha -> MATa switch)

All except HO are on yeast chromosome 3.

The "cassette" model was proposed to explain mating-type switching, which occurs at too high a frequency to be due to normal mutation events. The model proposes that the HMLalpha and HMRa loci contain "silent copies" of the alpha and a mating-type genes.
Replicas of either can be copied into MAT, the active locus, where they are expressed.

This model was supported by the following experiments:

1. Southern blot analysis

The MATalpha allele was cloned by complementation of a MATalpha-, HO- yeast mutant. This was used as a probe on Southern blots of yeast DNA to investigate the structures of the MAT, HMLalpha and HMRa loci:

[Figure: Southern blots of the MAT, HMLalpha and HMRa loci]

The identity of all these genes was confirmed by cloning and sequencing. The experiment shows that the MAT locus, which expresses the genetic information it contains, changes in length between the a and alpha forms, but the silent HMLalpha and HMRa loci stay the same.

2. Electron microscopy of DNA hybrids

To see which parts of each gene were homologous and which were different, all pairs of genes were mixed, denatured, reannealed and visualised by EM:

[Figure: electron microscopy of reannealed pairs of genes]

The cassette model

These and other data were put together into the cassette model:

[Figure: the cassette model]

The gene products shown are regulatory proteins that control the ability of the yeast to mate (and other aspects of its phenotype):

- alpha1 is a positive regulator that switches on genes required for the alpha phenotype, including alpha factor, a secreted pheromone
- alpha2 is a negative regulator that turns off a-specific genes
- in diploid cells, a1 and alpha2 combine to inhibit alpha1 (and hence all the genes it regulates), repress HO, and turn on the meiosis pathway if the diploid cells are starved

Why aren't HML and HMR expressed all the time? They have the same DNA sequence as MAT, which is expressed. The reason is that the HML and HMR loci are "silenced" by the products of the SIR genes (Silent Information Regulators). These proteins interact with regions of DNA ~1000 bp upstream of the loci that are transposed (HML and HMR). These DNA sequences are called "silencers". They can be turned around or moved up to 2.5 kb away and they are still active.
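The bookkeeping of the cassette model can be written out as a small data structure. What follows is only a toy sketch in Python, not a biological simulation: the class name `Chromosome3` and its methods are invented for illustration, and the single `ho_active` flag is an assumed simplification of the homothallic/heterothallic difference.

```python
from dataclasses import dataclass


@dataclass
class Chromosome3:
    """Toy model of the three mating-type loci on yeast chromosome 3."""

    hml: str = "alpha"  # silent donor cassette (HMLalpha)
    mat: str = "a"      # the only locus that is expressed
    hmr: str = "a"      # silent donor cassette (HMRa)

    def expressed_type(self) -> str:
        # HML and HMR are silenced by the SIR gene products,
        # so the phenotype is read from MAT alone.
        return self.mat

    def switch(self, ho_active: bool = True) -> None:
        """Copy the opposite silent cassette into MAT.

        The donor cassettes are never modified: a replica is copied,
        the original stays in place.
        """
        if not ho_active:
            return  # assumption: without HO, no switching takes place
        donor = self.hml if self.mat == "a" else self.hmr
        self.mat = donor


cell = Chromosome3()
print("before:", cell.expressed_type())
cell.switch()
print("after one switch:", cell.expressed_type())
print("donors unchanged:", cell.hml, cell.hmr)
```

The point of the sketch is that `switch` overwrites `mat` but leaves `hml` and `hmr` alone, which is why a cell can switch back and forth repeatedly.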
So in the cassette model, transposition moves the genes from a transcriptionally silent site to a transcriptionally active one. Silencing is believed to act through a localised change in the structure of chromatin.

The mechanism by which the genetic information in MAT is replaced by that from HML or HMR is known as "gene conversion". It is a process that is found in many aspects of genetics and it works as shown below. The initial event is cutting of the DNA at specific sites by the HO endonuclease:

[Figure: gene conversion at MAT, initiated by HO endonuclease cleavage]

As you can see, the outcome of this process is that the original DNA sequence (blue) is replaced by the homologous but different sequence (red), but the red sequence itself is left unchanged.

Other examples of gene regulation by DNA re-arrangement

Trypanosomes

Trypanosomes are protozoan organisms that can live as parasites in either the tsetse fly or mammals (including humans and cattle). They cause sleeping sickness in humans.

This is the life cycle of the trypanosome. VSG stands for variable surface glycoprotein, which is a major component of the coat of the organism and its main antigenic determinant. During the course of the life cycle, the VSG can switch between a number of different types. During the course of an infection, about 100 different VSG types can be produced. The trypanosome cycles through a series of types and this keeps it one step ahead of the host's immune defences. Altogether, there are about 1000 different VSG genes in the trypanosome's genome, but at any given time only one gene is being actively transcribed. The rate of switching between types is about 1 in 1,000,000 per cell division. The repertoire of VSG types produced by a single population of trypanosomes is called a serodeme.

The diagram shows how different VSG genes become the active one. A VSG gene is transcribed when it is in an "active site", close to a telomere. All the other VSG genes are silent. To be activated, a VSG gene must be transposed from a silent site to an active site, close to a telomere.
When this happens, the gene that was previously at the active site is lost. This is believed to happen by gene conversion, as described above for yeast mating types. There are regions of homology upstream and downstream of each VSG gene that initiate the gene conversion process. The upstream region of homology includes a few copies of a 70 bp repeat. Gene conversion is the most likely mechanism because (1) the gene copy at the active site is lost and (2) the amount of upstream and downstream DNA that is transposed can vary between different occurrences of the same gene replacement.

Although there are several sites close to telomeres to which VSG genes can be transposed, not all of these sites are active. Why should this be, since there are no obvious differences between the actual DNA sequences at these sites? The answer is not known for sure but may involve silencing due to modification of C bases in the non-active sites, or changes in chromatin structure that inhibit access by transcription factors. Compare this with yeast mating-type switching.

Phase variation in Haemophilus influenzae

- H. influenzae is a Gram-negative bacterium that infects the respiratory tract and can be involved in septicemia and meningitis
- It has a cell-surface lipopolysaccharide (LPS) that can have different structural forms
- The LPS switches between forms at a rate of 1/100 per cell division - this is called phase variation
- Genetic experiments showed that a locus, licABCD, is required for phase variation

This shows the first few bases of the licA gene. It contains a number of tandem repeats of the tetranucleotide CAAT (N = between 27 and 32 copies). When the gene from H. influenzae strains with different LPS forms was sequenced, it was found that they also had different numbers of CAAT repeats. Different CAAT numbers cause the reading frame of the protein in the upstream region to be shifted: each CAAT unit is 4 bases long while a codon is 3 bases long, so gaining or losing one unit moves the frame by one base. This diagram shows some examples.
In some cases (e.g. N = 29) there are no in-phase Met codons upstream and so no protein is produced. N = 30 or 31 produce protein forms that differ in their N-terminal regions.

Phase variation in Neisseria meningitidis

- N. meningitidis is another human pathogen, the invasive form of which is associated with meningitis
- It has 12 different antigenic forms, which differ due to variation in the polysaccharide capsule
- The gene involved is siaD, which codes for an enzyme of polysaccharide biosynthesis

When the siaD gene was sequenced from N. meningitidis strains of different pathogenicity, a stretch of repeated Cs was found near the 5' end. In the wild-type strain there are 7 Cs and a functional protein is made. In mutants with different polysaccharide capsules, there are different numbers of Cs, so the reading frame of the protein is disrupted and a non-functional protein is made. It is possible to get revertants to wild type from a mutant strain, and in these the number of Cs has gone back to 7.

A common molecular mechanism - slipped-strand mispairing

The last two examples (H. influenzae and N. meningitidis) have something in common - phase variation is controlled by a DNA sequence that is a simple sequence repeat (SSR), in one case a tetranucleotide repeat and in the other a mononucleotide repeat. SSRs are known to be prone to a high rate of mutation via a mechanism called slipped-strand mispairing (or replication slippage).

This picture illustrates slipped-strand mispairing. The second frame shows the two DNA strands dissociating, e.g. during DNA replication. Because it is a repeated DNA sequence, it can re-anneal either in the correct way or shifted, as in the 3rd frame. The mispaired DNA sequence is recognised as a replication error by the DNA repair system. One way in which it could be repaired is by nicking both strands and inserting an extra base opposite each mispaired base (4th frame).
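The link between repeat number and reading frame in the two bacterial examples comes down to arithmetic modulo 3, which the following minimal Python sketch tries to make explicit. The function names (`frame_offset`, `slip`, `in_frame_with`) are made up for illustration. The model also ignores the flanking sequence, so it can tell whether two repeat counts put the downstream codons in the same frame, but not which frame contains a usable start codon.

```python
CODON = 3  # bases per codon


def frame_offset(n_repeats: int, unit: str) -> int:
    """Reading-frame offset (0, 1 or 2) contributed by a repeat tract."""
    return (n_repeats * len(unit)) % CODON


def slip(n_repeats: int, delta: int) -> int:
    """Model one slippage event that adds (+1) or removes (-1) one repeat unit."""
    if delta not in (-1, 1):
        raise ValueError("delta must be +1 or -1")
    return max(n_repeats + delta, 0)


def in_frame_with(n_repeats: int, reference: int, unit: str) -> bool:
    """True if downstream codons are read in the same frame as with `reference` repeats."""
    return frame_offset(n_repeats, unit) == frame_offset(reference, unit)


# H. influenzae licA: the repeat unit CAAT is 4 bases long,
# so every gained or lost unit moves the frame by one base.
for n in (29, 30, 31, 32):
    print(f"CAAT x {n}: frame offset {frame_offset(n, 'CAAT')}")

# N. meningitidis siaD: 7 Cs is the functional (wild-type) tract length.
for n in (6, 7, 8):
    print(f"C x {n}: same frame as wild type? {in_frame_with(n, 7, 'C')}")

# One slippage event away from wild type and one back again restores the frame,
# which is how a revertant with 7 Cs can arise.
mutant = slip(7, +1)
revertant = slip(mutant, -1)
print("revertant in frame:", in_frame_with(revertant, 7, "C"))
```

Because 4 is not a multiple of 3, CAAT counts of 29, 30 and 31 fall into three different frames, matching the three different outcomes described for licA. For the run of Cs in siaD, a change of one or two bases leaves the wild-type frame, and a second slippage event can bring it back.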
There are other examples of mutations in SSRs that cause a change in phenotype, for example in human genetic disease. This will be covered in the Honours module on Chromosome Organisation and Development.

It is possible, but not proven, that under conditions where a mutation in an SSR would be favourable, e.g. to change the antigenic properties of the organism, the cell might up-regulate its DNA repair system so that the process shown in the diagram is accelerated. There is an interesting (though controversial) example of this in the phenomenon of adaptive mutation, where starved, non-dividing bacteria can acquire mutations that might allow them to start growing again.

The end of the lecture!