Unit 1 Cell and Molecular Biology
Section 8 The human genome project

History
• Late 1980s – the idea was proposed
• Predicted it would take 15 years
• Cost about $200 million per year ($1 per base pair)
• Officially began in 1990
• 26 June 2000 – joint announcement from Blair and Clinton: 'the draft complete'
• 12 Feb 2001 – joint publication in Nature and Science
• 14 Apr 2003 – the finished human genome

Method
• Genetic mapping – identifies relative positions of genes. E.g. gene 2 lies between genes 1 and 3.
• Physical mapping – absolute positions of genes on chromosomes. E.g. gene 2 is 1 million bp from gene 1.
• DNA sequencing – actual ATCG combinations.

The human genome
• The total genetic complement of a cell
• Has 3 billion (3 x 10^9) base pairs
• Comprises 23 pairs of chromosomes

How did biologists go about sequencing the genome?

Producing copies by PCR
In order to sequence DNA it is necessary to produce a huge number of exact copies of the original strands. The technique used to do this is known as PCR or 'Polymerase Chain Reaction'. Once the copies of DNA have been produced they can be analysed.
Note – this is the technique used by forensic scientists to amplify tiny samples of DNA for 'fingerprinting'.

The polymerase chain reaction
[Figure: the PCR cycle. DNA duplex (two strands) -> denature -> primers bind -> synthesis by Taq -> repeat cycle. The number of strands doubles each cycle: 4, 8, 16, etc.]

Understanding PCR
1. PCR can amplify any DNA sequence hundreds of millions of times in just a few hours. It is especially useful because it is highly specific, easily automated and capable of amplifying minute amounts of sample.
2. The whole process is only possible because of a special heat-stable enzyme called Taq polymerase, isolated from thermophilic bacteria.
3. Taq polymerase is able to tolerate temperatures of 95 °C and has a temperature optimum of 72 °C.
4. This enzyme can synthesise the complementary strand of a given DNA strand in a mixture containing the four DNA nucleotides and two short DNA fragments called primers.
Each primer is usually about 20 bases long. The primers are designed to bind to the DNA at either side of the target sequence.

Procedure
Step 1 – The DNA is heated to 95 °C, breaking the hydrogen bonds and separating the strands.
Step 2 – The strands are cooled to between 55 and 70 °C so that the primers can bind to them.
Step 3 – The strands are heated to between 70 and 72 °C so that Taq polymerase can copy each strand from the point of the primer.

Summary
PCR requires the following:
• Template DNA
• Primers – starting points for the construction of new strands
• Taq polymerase – a polymerase enzyme which works at high temperatures
• Supply of nucleotides
PCR can amplify a single molecule of DNA by a factor of millions: the number of copies doubles with each cycle, so 20 cycles give about a million-fold increase (2^20 = 1 048 576).

Genome mapping
(a) Genetic map – allows relative positions of heterozygous linked alleles to be located.
(b) Physical map (clone map) – allows the precise location of a specific DNA sequence to be located.
(c) Sequence – allows the sequence of nucleotides to be determined, e.g. AGGTCGCGATGCTA.

Genetic linkage mapping
1. Linkage mapping can be used to locate genes on particular chromosomes and establish the order of these genes and the approximate distances between them.
2. This idea is based on the fact that the further apart linked genes are on a chromosome, the more likely it is that crossing over will take place between them, resulting in more recombinants being formed.
3. The greater the number of recombinants, the further apart linked genes are on a chromosome. Working out the number of recombinants as a percentage of the total offspring gives a value which is called the recombination frequency.

Example:

Linked gene pair    Recombination frequency (%)
AB                  11
AC                  7
BC                  18

The information from the table can be used to produce a chromosome map. B and C are furthest apart (18 units), so A must lie between them, and 11 + 7 = 18:

B ----- 11 units ----- A --- 7 units --- C
|<------------- 18 units -------------->|

Genetic linkage mapping relies on having genetic markers that are detectable – sometimes these are genes that cause disease, traced in families by pedigree analysis.
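
The step from the recombination frequency table to the chromosome map can be sketched in a few lines of Python. This is only an illustration using the three example values above; the function name order_genes and the dictionary layout are assumptions made for this sketch, and it relies on the distances being roughly additive, as they are here.

```python
# Minimal sketch: order linked genes from pairwise recombination frequencies.
# The function name and data layout are hypothetical, chosen for illustration.

def order_genes(rf):
    """Return (gene, position) pairs, measured from one end of the map."""
    genes = sorted({gene for pair in rf for gene in pair})

    def distance(a, b):
        # Frequencies are stored once per pair, so look the pair up both ways.
        return rf.get((a, b), rf.get((b, a)))

    # The pair with the highest recombination frequency lies at the two ends.
    left, _ = max(rf, key=rf.get)

    positions = {left: 0}
    for gene in genes:
        if gene != left:
            positions[gene] = distance(left, gene)
    return sorted(positions.items(), key=lambda item: item[1])


# Recombination frequencies (%) from the example table.
rf = {("A", "B"): 11, ("A", "C"): 7, ("B", "C"): 18}
gene_map = order_genes(rf)
print(gene_map)  # [('B', 0), ('A', 11), ('C', 18)]
```

The output places A 11 units from B and C 18 units from B, which matches the map drawn from the table.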
The marker alleles must be heterozygous and be linked on the same chromosome so that recombination can be detected. The overall result of genetic mapping is to produce a picture of the locations of the marker loci on the chromosomes – rather like establishing the order of the cities and large towns between two points on a map.

Physical mapping
1. Physical mapping is required to add more detail to what is obtained by genetic mapping.
2. As with genetic maps, construction of a physical map requires markers that can be mapped to a precise location on the DNA sequence.
3. The distance between markers in a physical map is usually expressed as a number of nucleotides.
4. A physical map can be made by isolating DNA from a chromosome, then cutting it using restriction enzymes (also known as restriction endonucleases) to construct a pattern.
5. Different restriction enzymes cut the DNA at different points, as each recognises a particular short sequence of bases occurring in the DNA. Wherever the sequence is recognised, the enzyme cuts the DNA, so that it is cut into fragments.
6. By using combinations of restriction enzymes and working out the size of the fragments it is possible to recognise a pattern. The fragments can be identified by their size or by using a specific DNA probe to bind to its complementary sequence.

Physical restriction mapping example
1. Genes do not exist as separate entities but as part of the larger DNA molecule.
2. DNA can be broken up into fragments by ENDONUCLEASES which cut at specific base sequences - look up page 316 of your textbook for more examples.

[Figure: DNA cut at each GGCC recognition sequence to give fragments]

Not 1 - always cuts at the following sequence of eight pairs.
This happens on average every 65536 nucleotide pairs (1 in 4^8) and produces much larger fragments.

[Figure: the Not 1 recognition sequence, GCGGCCGC]

The resulting fragments can be separated using GEL ELECTROPHORESIS and used to physically map the sections of DNA - see textbook p317.

The following table shows the fragments produced from a 15 kbp fragment:

Enzyme(s)                        Fragment sizes (kbp)
Bam H1                           14, 1
EcoR1                            12, 3
Pst1                             8, 7
Bam H1 plus EcoR1                11, 3, 1
Bam H1 plus Pst1                 8, 6, 1
EcoR1 plus Pst1                  7, 5, 3
Bam H1 plus EcoR1 plus Pst1      6, 5, 3, 1

ANALYSING THE FRAGMENTS

Scale: 0   3   6   9   12   15 kbp

A) Bam H1 (14 + 1)
   | 14 | 1 |
B) EcoR1 (12 + 3)
   Option 1: | 3 | 12 |
   Option 2: | 12 | 3 |
C) Bam H1 plus EcoR1 (11 + 3 + 1)
   Option 1: | 3 | 11 | 1 |
   Option 2: | 12 | 2 | 1 |

Option 2 for EcoR1 above would give 12 + 2 + 1 with the double digest, which is INCORRECT. This means that the fragment must have been cut with the following orientation:

EcoR1 (3 kbp) ... Bam H1 (14 kbp)

By repeating this procedure it is possible to build up a RESTRICTION MAP for this section of DNA - this lets us know the base sequence at each point of cut.

Scale: 0   3   6   9   12   15 kbp

A) Bam H1 (14 + 1)
   | 14 | 1 |
B) Pst1 (8 + 7)
   Option 1: | 8 | 7 |
   Option 2: | 7 | 8 |
C) Bam H1 plus Pst1 (8 + 6 + 1)
   Option 1: | 8 | 6 | 1 |
   Option 2: | 7 | 7 | 1 |

Option 2 for Pst1 above would give 7 + 7 + 1 with the double digest, which is INCORRECT. This means that the fragment must have been cut with the following orientation:

EcoR1 (3 kbp) ... Pst1 (8 kbp) ... Bam H1 (14 kbp)

Checking against the results of the triple digest (EcoR1 plus Pst1 plus Bam H1) shows that our map is correct: cuts at 3, 8 and 14 kbp give fragments of 3, 5, 6 and 1 kbp.

• Since each cut with a known enzyme occurs at a specific base sequence, comparing restriction maps allows biologists to look for the numbers and locations of these base sequences. The theory is that the greater the number of shared sequences, and the more closely their locations on the DNA match, the more closely related the individuals.

In the following example three endonucleases have been used (1-3) and have cut the strands at the points shown.
The results indicate that individuals A and B are more closely related than individuals A and C or B and C.

[Figure: order of the cut sites along the DNA of each individual]
A: 1 2 3
B: 1 2 3
C: 1 3 2

Physical mapping relies on the availability of many copies of the DNA fragment. This is only possible because of the technique known as POLYMERASE CHAIN REACTION or PCR, which allows many copies of the section of DNA to be produced.

DNA sequencing
1. The final stage of the genome project is to determine and assemble the actual DNA sequence itself.
2. There are several critical requirements for this part:
   a. single-stranded DNA fragments must be generated as the templates;
   b. sequencing technology must be accurate and fast;
   c. computer hardware and software must be available to analyse the sequence data.
3. The technique used for sequencing is called the dideoxy chain-termination method.
4. This method relies on making a copy of the DNA template to be sequenced using:
   a. a DNA polymerase;
   b. a primer;
   c. the four dNTPs (deoxyribonucleoside triphosphates dATP, dCTP, dTTP and dGTP) to extend the chain;
   d. a labelled dNTP, nowadays labelled with a fluorescent dye rather than a radioactive element as used in the past.
5. In the correct conditions the polymerase can make a copy of the DNA by a process that is essentially the same as that used in DNA replication.
6. The chain-termination part is what makes the key difference from normal DNA replication. This involves setting up four separate reactions, each including one of the four dideoxy NTPs (ddATP, ddGTP, ddCTP and ddTTP).
7. These modified nucleotides cannot form the next phosphodiester bond in the growing chain – hence when a ddNTP is incorporated into the copy, it terminates the process.
8. The large number of fragments produced in the four reactions forms a set of sequences that differ in length by one base, each ending with a particular ddNTP.
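
The logic of the chain-termination method can be sketched in Python. This is a simplified, deterministic model and not a simulation of the real chemistry: it assumes every possible terminated fragment is produced, counts fragment lengths from the end of the primer, and uses a short made-up template. The function names are hypothetical.

```python
# Simplified model of dideoxy chain-termination sequencing.
# Assumptions: every possible terminated fragment is present, and lengths are
# counted from the end of the primer. All names here are illustrative only.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template):
    """Return the new strand (5' to 3') built on a template written 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template)


def chain_termination_fragments(template):
    """For each of the four ddNTP reactions, list the fragment lengths made."""
    reactions = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(copy_strand(template), start=1):
        # A copy stops at this length when the matching ddNTP is incorporated.
        reactions[base].append(length)
    return reactions


def read_gel(reactions):
    """Read the sequence from the shortest fragment to the longest."""
    base_at_length = {
        length: base
        for base, lengths in reactions.items()
        for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


template = "TACGGT"  # made-up template, written 3' to 5'
reactions = chain_termination_fragments(template)
sequence = read_gel(reactions)
print(reactions)  # {'A': [1, 6], 'C': [4, 5], 'G': [3], 'T': [2]}
print(sequence)   # ATGCCA
```

Each fragment length appears in exactly one of the four reactions, so reading the lengths in order from shortest to longest gives the sequence of the new strand, which is complementary to the template.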
[Figure: a deoxyribonucleoside triphosphate allows strand extension at the 3' end; a dideoxyribonucleoside triphosphate prevents strand extension at the 3' end]

The reaction mixture contains:
• normal deoxyribonucleoside triphosphate precursors (dATP, dCTP, dGTP and dTTP)
• an OLIGONUCLEOTIDE primer for DNA polymerase
• a small amount of one dideoxyribonucleoside triphosphate (A in this example)
• the single-stranded DNA molecule to be sequenced

Rare incorporation of the dideoxyribonucleoside by DNA polymerase blocks further growth of the DNA molecule.

5' GCTACCTGCATGGA 3'
3' CGATGGACGTACCTCTGAAGCG 5'   (single-stranded DNA molecule to be sequenced)

The fragments can be separated using gel electrophoresis. For an animated version of this, see 'DNA sequencing by enzyme methods'.

Separating DNA fragments
DNA fragments can be separated by gel electrophoresis.
[Figure: gel with DNA fragments. The largest fragments travel the shortest distance; the smallest fragments travel furthest towards the positive (+) terminal.]
DNA moves to the positive terminal due to its overall negative charge.

DNA sequencing: separating the DNA fragments
The different fluorescent labels in the copied DNA strand are detected as they come off the bottom of the gel.

Automated DNA sequencing
This gives a direct readout of the sequence, and the process can be automated so that it is much faster than conventional sequencing.
[Figure: automated DNA sequencing, panels (a) to (d), showing the gel and the sequence readout A C T G A C]

To see an animated version of this process go to the 'cycle sequencing' section on the following website of the Biology Animation Library:
http://www.dnalc.org/resources/BiologyAnimationLibrary.htm

Activity
• Look at the arrangements document to clarify what information is required.
• Read DART pg 73 – 81.
• Read the Monograph pg 67 – 79.
• Scholar – 8
• Internet research