Non-coding DNA accounts for more than 98% of the human genome and is defined as the DNA sequences within a genome that do not code for protein. In other words, these DNA bases are never represented in the amino acid sequence of expressed proteins. This does not mean that non-coding DNA has no function. DNA is 'expensive' to maintain, so, from an evolutionary viewpoint, it would be expected to have a function, although we do not always know what that function is.

Functions of non-coding DNA

The regions of DNA that do not code for proteins include:

Regulators of gene expression: These are DNA sequences that regulate gene expression in various ways. For instance, promoters are sequences that occur just before genes and act as binding sites for the RNA polymerase enzymes that catalyse transcription. Other DNA sequences act as binding sites for proteins that either increase or decrease the rate of transcription; these are known as enhancers and silencers, respectively.

Introns: These are DNA base sequences found within genes. They are transcribed but are removed from the RNA transcript before it is translated, so they do not contribute to the amino acid sequence of the polypeptide made from the gene.

Telomeres: These are repetitive sequences that protect the ends of the chromosome (see Figure 1). With every cell division, short stretches of DNA are lost from the telomeres.

Genes for tRNAs: These genes code for RNA molecules that do not get translated, but instead fold to form tRNA molecules that play an important role in translation.

Tandem repeats and DNA profiling

A tandem repeat is a sequence of two or more DNA base pairs that is repeated so that the copies lie end-to-end on the chromosome, as shown in Figure 2. Tandem repeats generally form part of non-coding DNA, though they may also be present in protein-coding regions.
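The idea of a unit repeated end-to-end can be illustrated with a short counting routine. The following is a minimal Python sketch and not part of the original material: the function name longest_tandem_run and the example sequence, including its flanking bases, are made up for illustration.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    # Try every possible starting position for a run of repeats.
    for start in range(len(sequence) - len(unit) + 1):
        count = 0
        pos = start
        # Count copies of the unit that follow each other without a gap.
        while sequence.startswith(unit, pos):
            count += 1
            pos += len(unit)
        best = max(best, count)
    return best


# Hypothetical locus: five GC repeats between invented flanking bases.
example = "ATTA" + "GC" * 5 + "TTAC"
print(longest_tandem_run(example, "GC"))
```

Every starting position is checked separately, which is slower than skipping ahead but also handles repeat units that overlap with themselves.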
Tandem repeats at a single genetic locus, where the number of repeated DNA segments varies from individual to individual, are frequently used for identification in DNA fingerprinting.

Figure 2. Variation of tandem repeats in three different individuals.

DNA profiling, also called DNA fingerprinting, DNA testing or DNA typing, is a technique used to identify individuals by analysing their DNA. As tandem repeats vary among individuals, differences in these regions can be analysed to produce a DNA profile (see Figure 3). Human DNA profiles have various uses, including identifying the origin of a DNA sample from a crime scene, testing for parentage and genealogical research.

DNA profiling involves the following steps:

• Collection of samples and extraction of DNA
• Amplification of the DNA region containing tandem repeats by PCR
• Separation of the DNA fragments by gel electrophoresis

Figure 3 shows how two individuals differ in the number of tandem repeats and how that difference is visualised on a gel.

Figure 3. DNA profile of two individuals.

In Individual #1, allele A5 contains five copies of the GC repeat. When this region is amplified by PCR, it yields a larger DNA fragment than allele A2, which contains only two GC repeats. These size differences can be clearly seen on the gel.

DNA sequencing

DNA sequencing is used for many purposes: DNA profiling, paternity suits, forensics, cancer analysis and genome studies. Frederick Sanger published the principles of DNA sequencing in 1977 and was awarded his second Nobel Prize in Chemistry three years later. His technique is called the dideoxy chain termination method. It is based on the fact that DNA polymerase needs the 3' OH group of the preceding nucleotide in order to add another nucleotide to the DNA strand.
If a dideoxynucleotide (a 'normal' DNA nucleotide that lacks the 3' OH group) is added to the mixture and is built into the growing DNA strand, no further nucleotides can be added and synthesis of that strand stops. Modern DNA sequencers use this method and label each of the four dideoxynucleotides with a different fluorescent dye, so that the base present where replication stopped can be recognised. From this, the base on the parent strand is deduced. The diagram in Figure 1 demonstrates the basics of the technique:

Figure 1. Dideoxy chain terminating sequencing method.
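The logic of reading a sequence from chain-terminated fragments can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a model of a real sequencer: the function names and the template sequence are hypothetical, the primer and strand orientation are ignored (the template is written in the order the polymerase reads it), and exactly one fragment is assumed for each possible stopping position.

```python
# Base-pairing rules used to build the new strand and to deduce the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def terminated_fragments(template: str) -> list:
    """One fragment per stop position; each ends in a labelled dideoxynucleotide."""
    new_strand = complement(template)
    return [new_strand[:length] for length in range(1, len(new_strand) + 1)]


def read_new_strand(fragments) -> str:
    """Sort fragments by length and read the terminal (dye-labelled) base of each."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


template = "TACGGATC"                      # hypothetical parent strand
fragments = terminated_fragments(template)
fragments.reverse()                        # the mixture has no inherent order
new_strand = read_new_strand(fragments)
print(new_strand)                          # sequence of the newly made strand
print(complement(new_strand))              # deduced parent strand
```

Sorting by length stands in for the size separation step, and reading only the last base of each fragment mirrors the fact that only the terminating dideoxynucleotide carries a dye.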