Repeated DNA sequences 2
Prof Duncan Shaw
Molecular & Cell Biology

Lecture 2
- Mini-satellites and DNA fingerprints
- SINEs and LINEs - interspersed repeats
- Functions of repeats
- Mutagenesis by inter-repeat recombination

Mini-satellites and DNA fingerprints

- Mini-satellites are tandemly repeated 15-30bp sequences, distributed throughout mammalian and other genomes.
- Individual loci are highly polymorphic in length (the number of repeat units varies between alleles), and a probe to the repeated sequence can detect many loci at once.
- They have found use as "DNA fingerprints", since the pattern of alleles in any individual person is virtually unique.
- They have many applications in forensics, identifying family relationships, etc.

The picture shows a Southern blot of DNA from different family members, probed using a mini-satellite. You can work out which of F1 and F2 is the father of child C by comparing their bands with the child's: any band in C that was not inherited from the mother must be present in the true father. (Reproduced from "Essential Medical Genetics" by M.Connor and M.Ferguson-Smith, with permission from Blackwell Science).

These sequences were discovered by Prof. Alec Jeffreys of Leicester University. He was originally studying the seal myoglobin gene. This is an excellent example of science driven by intellectual curiosity leading to an extremely valuable practical application. Don't let anyone tell you that pure research has no useful benefits!

SINEs and LINEs - interspersed repeats

There are 2 main types to consider in mammalian genomes - SINEs (short interspersed repeats) and LINEs (long interspersed repeats).

SINEs
- Length 100-500bp.
- Copy number up to 1,000,000.
- In primates, the main type is called the "Alu repeat", as it has a site for the AluI restriction enzyme. It is about 300bp long.
- Some SINEs are homologous to small cytoplasmic RNAs, including tRNA and 7SL RNA. They may be processed pseudogenes derived from these RNAs.

The picture shows the structure of 2 typical SINEs and their homology to 7SL RNA.
Each differently shaded box represents a segment of conserved sequence.

Many SINEs are flanked by short direct repeat sequences. This suggests that they could have originated by insertion of a transposable DNA element.

LINEs
- LINEs are up to 7kb in length.
- Their copy number is from 4000 to 100,000, depending on the exact type.
- Their structures suggest that they were all derived from an original full-length version, and that many have since undergone deletion of the 5' end.
- Like SINEs, LINEs may have originated as transposable elements. The original element may have coded for its own reverse transcriptase, as some LINEs have an open reading frame with homology to that enzyme.

This would provide a mechanism for mobility in the genome:
1. A gene containing a LINE is transcribed.
2. The RNA is then reverse transcribed.
3. The DNA copy of the LINE is inserted into a new genomic locus, as in the previous picture.

We already saw that the sequences of the rRNA repeats are more conserved than would be expected if they were evolving independently, and that they are subject to unequal crossing over, which creates different copy numbers. The sequences of interspersed repeats are also more conserved than would be expected for sequences that don't code for protein. But unequal crossing over cannot account for this in interspersed repeats, since it would disrupt the organisation of all the DNA between the repeats (draw a diagram to see why). A possible mechanism to allow homogenisation of interspersed repeats is gene conversion.

This picture shows yet another yeast experiment, illustrating gene conversion between interspersed copies of the yeast repeat sequence Ty. One copy (Ty) has a Ura3 gene inserted into it. Ty' is a slightly different version of the sequence. If you screen this strain for mutants that have become Ura-, then look at the sequence of the locus that used to have Ura3 in it, you find that it has been converted to the sequence of Ty'.
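As a rough illustration of the Ty experiment, the sketch below treats gene conversion as a one-way copy: the recipient locus takes on the donor's sequence, while the donor itself is left unchanged. The sequences, the locus names and the function gene_conversion are all made up for this example, and the model ignores the actual molecular mechanism.

```python
# Toy model of gene conversion between two interspersed repeat copies.
# All sequences and locus names are hypothetical, chosen only for illustration.

TY = "ACGTACGT"         # stand-in for the Ty sequence
TY_PRIME = "ACGTTCGT"   # stand-in for Ty', differing from Ty at one position
URA3 = "[URA3]"         # stand-in for the inserted Ura3 marker gene

# Starting strain: one Ty copy carries the Ura3 insert, the other locus is Ty'.
genome = {
    "locus_A": TY[:4] + URA3 + TY[4:],
    "locus_B": TY_PRIME,
}


def gene_conversion(genome, recipient, donor):
    """Return a new genome in which the recipient locus has been
    overwritten with a copy of the donor locus (non-reciprocal transfer)."""
    converted = dict(genome)               # leave the input genome untouched
    converted[recipient] = genome[donor]   # donor sequence copied, donor kept
    return converted


after = gene_conversion(genome, recipient="locus_A", donor="locus_B")

print(after["locus_A"])           # ACGTTCGT (now the Ty' sequence)
print(URA3 in after["locus_A"])   # False: the marker is lost, so the cell is Ura-
print(after["locus_B"])           # ACGTTCGT (donor is unchanged)
```

The point of the sketch is the asymmetry: unlike a reciprocal crossover, nothing is exchanged, so the DNA between the two loci keeps its organisation while the two repeat copies become identical.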
The mechanism of gene conversion is described in a previous lecture.

Functions for interspersed repeats?

- Some very short repeats (<10bp) are found in promoters of genes, where they function in gene regulation (e.g. as binding sites for transcription factors).
- But there is no definite function known for SINEs or LINEs, even though they are present in the primary transcripts of some genes.
- So maybe they are truly "selfish DNA": their abundance is due to their "reproductive success", i.e. their ability to multiply and disperse themselves through a genome.

The term echoes the "selfish gene" concept of Richard Dawkins, the evolutionary biologist at Oxford University Zoology Dept.

Bad effects of SINEs

Mutations in the low-density lipoprotein receptor gene (LDLR) are a common genetic cause of heart disease due to hypercholesterolemia. The LDLR gene is 45kb long. Several Alu repeats are found in its introns and untranslated regions. In one case it was found that a mutation had occurred by recombination between 2 of these Alus, leading to a truncated gene and defective protein.

Another example is the disease neurofibromatosis type 1, where a mutation due to insertion of an Alu has been reported. You can find out more from the Human Mutation website, by searching for NF1 then looking at the entries for "gross insertions & duplications".
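To see why recombination between two Alu repeats in the same gene can truncate it, as in the LDLR case above, the sketch below models a gene as an ordered list of segments. Recombination between two repeats in the same orientation removes everything between them and leaves a single repeat copy. The segment layout and the function recombine_direct_repeats are hypothetical; the real positions of the Alus and exons in LDLR are not given in these notes.

```python
# Toy model of a deletion caused by recombination between two direct repeats.
# The gene layout below is hypothetical and is NOT the real LDLR structure.

def recombine_direct_repeats(segments, i, j):
    """Recombine the repeats at positions i and j (i < j) of a segment list.

    Everything between the two repeats is deleted, and the two repeat
    copies are fused into one."""
    if not 0 <= i < j < len(segments):
        raise ValueError("need two distinct positions with i < j")
    if segments[i] != segments[j]:
        raise ValueError("recombination needs two homologous repeats")
    # Keep everything up to and including the first repeat,
    # then resume after the second repeat.
    return segments[: i + 1] + segments[j + 1 :]


gene = ["exon1", "Alu", "exon2", "exon3", "Alu", "exon4"]
mutant = recombine_direct_repeats(gene, 1, 4)

print(mutant)   # ['exon1', 'Alu', 'exon4']
```

In this toy gene, exon2 and exon3 are lost, which is the kind of event that would give a shortened gene and a defective protein.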