Repeated DNA sequences

Repeated DNA sequences 2
Prof Duncan Shaw
Molecular & Cell Biology
Lecture 2
Mini-satellites and DNA fingerprints
SINES and LINES - interspersed repeats
Functions of repeats
Mutagenesis by inter-repeat recombination
Mini-satellites and DNA fingerprints

Mini-satellites are tandemly repeated 15-30bp sequences, and are distributed throughout
mammalian and other genomes.

Individual loci are highly polymorphic in length, and a probe to the repeated sequence
can detect many loci at once

They have found use as "DNA fingerprints" since the pattern of alleles in any individual
person is virtually unique

They have many applications in forensics, identifying family relationships, etc.
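To make the length polymorphism concrete, here is a minimal sketch of how repeat copy number sets the size of the fragment seen on a blot. The 20bp repeat unit, the 500bp of flanking DNA and the copy numbers are all illustrative assumptions, not values from a real locus.

```python
def allele_length(copies, unit_len=20, flank_len=500):
    """Length in bp of a restriction fragment carrying a mini-satellite allele.

    unit_len and flank_len are illustrative assumptions, not measured values.
    """
    return flank_len + copies * unit_len

# Two hypothetical alleles at one locus that differ only in repeat copy number
for copies in (12, 45):
    print(copies, "copies ->", allele_length(copies), "bp")
```

Because only the copy number differs between alleles, each allele gives a band of a different size.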
The picture shows a Southern blot of DNA from different family members, probed
using a mini-satellite. You can work out which of F1 and F2 is the father of child C,
by observing which bands they have in common. (Reproduced from "Essential
Medical Genetics" by M.Connor and M.Ferguson-Smith, with permission from
Blackwell Science).
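The band-matching logic can be sketched in a few lines. This is only an illustration: the band sizes are invented, the function name is hypothetical, and it assumes the mother's lane is also available, so that any child band not seen in the mother must have come from the father.

```python
def candidate_fathers(child, mother, candidates):
    """Return names of candidates who carry every child band not found in the mother."""
    paternal_bands = set(child) - set(mother)
    return [name for name, bands in candidates.items()
            if paternal_bands <= set(bands)]

# Hypothetical band sizes in kb, invented for illustration only
mother = {2.1, 3.4, 5.0, 7.2}
child = {2.1, 4.6, 5.0, 8.8}
candidates = {
    "F1": {1.9, 4.6, 6.3, 8.8},
    "F2": {2.7, 3.9, 6.3, 9.5},
}
print(candidate_fathers(child, mother, candidates))
```

With these made-up lanes, only F1 carries both of the child's non-maternal bands.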
These sequences were discovered by Prof. Alec Jeffreys of Leicester University. He
was originally studying the seal myoglobin gene. This is an excellent example of
science driven by intellectual curiosity leading to an extremely valuable practical
application. Don't let anyone tell you that pure research has no useful benefits!
SINES and LINES - interspersed repeats
There are two main types to consider in mammalian genomes: SINEs (short
interspersed repeats) and LINEs (long interspersed repeats).
SINES

Length 100-500bp

Copy number up to 1,000,000

In primates, main type is called "Alu repeat" as it has a site for the AluI restriction
enzyme. It is about 300bp long

Some SINEs are homologous to small cytoplasmic RNAs including tRNA and
7SL RNA. They may be processed pseudogenes derived from these RNAs.
The structure of 2 typical SINEs and their homology
to 7SL RNA. Each differently shaded box represents a
segment of conserved sequence.
Many SINEs are flanked by short direct repeat
sequences. This suggests that they could have originated
by insertion of a transposable DNA element.
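As a rough sketch of how such flanking repeats could be looked for in a sequence, the function below compares the bases just before an element with the bases just after it. The function name, the toy sequence and the run of "N" standing in for a 300bp element are all assumptions made for illustration.

```python
def flanking_direct_repeat(seq, start, end, max_len=20):
    """Longest direct repeat immediately flanking seq[start:end].

    Compares the bases ending just before `start` with the bases
    beginning at `end`, and returns the longest identical pair found.
    """
    best = ""
    for n in range(1, max_len + 1):
        if start - n < 0 or end + n > len(seq):
            break
        left = seq[start - n:start]
        right = seq[end:end + n]
        if left == right:
            best = left
    return best

# Toy example: a placeholder element flanked by the 6bp direct repeat AAGCTA
element = "N" * 300
seq = "TTGACC" + "AAGCTA" + element + "AAGCTA" + "GGTCAT"
start = 12
end = start + len(element)
print(flanking_direct_repeat(seq, start, end))
```

Real flanking repeats are often imperfect, so an exact-match comparison like this one is only a starting point.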
LINES
LINEs are up to 7kb in length. Their copy number is
from 4000 to 100,000 depending on the exact type.
Their structures suggest that they were all derived from
an original full length version and that many have since
undergone deletion of the 5' end.
Like SINEs, LINEs may have originated as transposable
elements. The original element may have coded for its own
reverse transcriptase, as some LINEs have an open reading
frame with homology to that enzyme. This would
provide a mechanism for mobility in the genome:

Gene containing LINE is transcribed

RNA is then reverse transcribed

DNA copy of LINE is inserted into a new
genomic locus, as in the previous picture
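The three steps above can be sketched as string operations on a toy genome. This is a simplified model under stated assumptions: sequences are written as the sense strand only, the element and genome are invented, and the function names are hypothetical.

```python
def transcribe(dna):
    """DNA sense strand -> RNA transcript (T replaced by U)."""
    return dna.replace("T", "U")

def reverse_transcribe(rna):
    """RNA -> DNA copy, written as the sense strand (U replaced by T)."""
    return rna.replace("U", "T")

def retrotranspose(genome, line_start, line_end, target):
    """Copy the element at genome[line_start:line_end] into position `target`.

    The original copy stays in place, so copy number goes up by one.
    """
    rna = transcribe(genome[line_start:line_end])
    dna_copy = reverse_transcribe(rna)
    return genome[:target] + dna_copy + genome[target:]

# Toy genome: lower-case letters are unique DNA, upper-case is a stand-in LINE
genome = "aaaa" + "ATGCCGTA" + "ccccgggg"
new_genome = retrotranspose(genome, 4, 12, 16)
print(new_genome)
```

The point of the sketch is that this is a copy-and-paste mechanism: the source element is not excised, which is how copy number can build up over time.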
We already saw that the sequences of the rRNA repeats are more conserved than would be
expected if they were evolving independently, and that they are subject to unequal
crossing-over, which creates different copy numbers. The sequences of interspersed repeats
are also more conserved than would be expected for sequences that don't code for protein.
But unequal crossing-over cannot be what homogenises interspersed repeats, since it would
mess up the organisation of all the DNA between the repeats (draw a diagram to see why).
A possible mechanism to allow homogenisation of interspersed repeats is gene conversion.
This picture shows yet another yeast experiment to
illustrate gene conversion between interspersed copies of
the yeast repeat sequence Ty. One copy (Ty) has a Ura3
gene inserted into it. Ty' is a slightly different version of
the sequence. If you screen this strain for mutants that
have become Ura-, then look at the sequence of the locus
that used to have Ura3 in it, you find that it has been
converted to the sequence of Ty'. The mechanism for
doing this, gene conversion, is covered in a previous
lecture.
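The outcome of the Ty experiment can be modelled very simply, as a non-reciprocal copy from one locus to another. In this sketch the locus names are hypothetical and the strings are labels rather than real sequences; it shows only the end result, not the molecular mechanism.

```python
def gene_conversion(genome, recipient, donor):
    """Non-reciprocal transfer: the recipient locus takes the donor's sequence.

    Returns a new dict; the donor locus is left unchanged.
    """
    converted = dict(genome)
    converted[recipient] = genome[donor]
    return converted

# Hypothetical loci; the strings are labels, not real sequences
before = {"locus_A": "Ty::URA3", "locus_B": "Ty'"}
after = gene_conversion(before, "locus_A", "locus_B")
print(after)
```

Note that the donor copy is unchanged and nothing between the two loci is rearranged, which is why gene conversion can homogenise interspersed repeats without disrupting the DNA between them.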
Functions for interspersed repeats?

Some very short repeats (<10bp) are found in
promoters of genes, where they function in gene
regulation (e.g. binding sites for transcription factors)

But there is no definite function known for SINEs or LINEs, even though they are present
in the primary transcripts of some genes

So, maybe they are truly "selfish DNA"; their abundance is due to their "reproductive
success", i.e. their ability to multiply and disperse themselves through a genome

The idea builds on the "selfish gene" concept of Richard Dawkins, the evolutionary
biologist at Oxford University Zoology Dept; the term "selfish DNA" itself was
popularised in 1980 by Orgel and Crick and by Doolittle and Sapienza.
Bad effects of SINEs
Mutations in the low-density lipoprotein receptor gene (LDLR) are a common genetic cause of
heart disease due to hypercholesterolemia.
The LDLR gene is 45kb long. Several Alu repeats
are found in its introns and untranslated regions. In
one case it was found that a mutation had occurred
by recombination between 2 of these Alus, leading
to a truncated gene and defective protein.
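A toy model shows why recombination between two repeats in the same gene is damaging. It assumes the two repeats are in the same orientation, so that recombination leaves a single hybrid repeat and deletes everything in between; the "gene" string and the function name are invented for illustration.

```python
def recombine_between_repeats(seq, repeat):
    """Recombine the first two direct copies of `repeat` in `seq`.

    One hybrid repeat remains and everything between the two copies is lost.
    Returns `seq` unchanged if fewer than two copies are present.
    """
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        return seq
    return seq[:first] + repeat + seq[second + len(repeat):]

# Toy gene: "ALU" stands in for an Alu repeat, digits for exons (all hypothetical)
gene = "1-ALU-2-3-ALU-4"
print(recombine_between_repeats(gene, "ALU"))
```

In this toy gene, exons 2 and 3 are lost along with one copy of the repeat.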
Another example is the disease neurofibromatosis
type 1, where a mutation due to insertion of an Alu
has been reported. You can find out more from the
Human Mutation website, by searching for NF1
then looking at the entries for "gross insertions & duplications".