(Huntington’s Disease) Brain Journal, 2004, Everett & Wood, pp. 2385-2405
Different regions of the brain affected by Triple Repeat Diseases (Page 413, book)

Isolation Of The Gene Implicated In Spinocerebellar Ataxia Type-1 From Five Primate Species
S.J. Richards, E.B. Whitledge, J.M. Lau, and D.L. Robinson
Department of Biology, Bellarmine University, 2001 Newburg Rd, Louisville, KY 40205

Introduction
Spinocerebellar Ataxia Type-1 (SCA-1) is a rare, dominantly-inherited, neurodegenerative disease that results from a tri-nucleotide (CAG) expansion. Six to thirty-nine CAG repeats occur in the Ataxin-1 gene of healthy people (ref. 3). What makes this neurodegenerative disease unique is ‘anticipation’. As the mutant gene is passed from generation to generation, the age of onset of symptoms decreases as a result of enlargement of the poly-Q (polyglutamine) region (ref. 2).
The expansion of the poly-Q repeat region occurs during DNA replication, and can reach as high as 81 CAG repeats in the Ataxin-1 gene of people with SCA-1. As a result, the protein Ataxin-1 gains a toxic function that results in the eventual death of the Purkinje cells of the cerebellum and spinal cord. The length of the poly-Q region is negatively correlated with the age of onset.
[Figure: NORMAL Purkinje Cell vs. ABNORMAL Purkinje Cell]

Objectives
• Isolate and sequence the CAG repeat region of the Ataxin-1 gene from five primate species:
  Macaca assamensis (Assamese Macaque)
  Cercopithecus aethiops (Vervet Monkey)
  Eulemur macaco (Black Lemur)
  Pongo pygmaeus (Orangutan)
  Lagothrix lagotricha (Woolly Monkey)
• Search various DNA databases for other SCA-1 sequences, and use that information to examine the evolution of the CAG repeat region.

Methods
• Primer design and optimization.
• PCR amplification and purification.
• Ligation to a plasmid vector.
• Heat-shock transformation into competent cells.
• Plasmid prep and purification.
• Sequencing.

PCR
PCR primers were designed using published DNA sequences for the Ataxin-1 gene in both Chimpanzee and Human (NCBI GenBank).
Forward 5’-ACCTATGCCAGCTTCATCCCATC-3’; Tm: 59.0 °C
Reverse 5’-GTCATGCAGGTGTAAAGGTCAAGA-3’; Tm: 56.8 °C
DNA was extracted from blood or muscle tissue from healthy animals. Five ng of DNA template was used for PCR.
PCR conditions: initial denaturing at 95 °C for 5 min, followed by 35 cycles of 95 °C for 1:20 min, 52 °C for 1:43 min, and 73 °C for 1:20 min, and a final 6 min extension at 73 °C.
[Figure: Initial SCA-1 PCR. Lanes 1-4: 100 bp MW ladder, Human DNA, Lemur DNA, Macaque DNA. Size markers labeled: 400 bp, 300 bp]
[Figure: Second SCA-1 PCR. Lanes: 100 bp MW ladder, Macaque, Orangutan, Vervet, Woolly Monkey, 100 bp MW ladder. Size markers labeled: 200 bp, 100 bp]

Transformation
After PCR amplification, UA cloning was used to ligate the PCR product into a plasmid vector (Qiagen). Heat-shock transformation was then used to incorporate the plasmid vector into bacterial cells. Transformants were selected using blue-white screening. Vector plasmid was purified and quantified, and a sample was digested using EcoRI to remove the insert from the plasmid.
[Figure: Vector DNA map, with an EcoRI restriction site on each side of the insert]
[Figure: Lemur SCA-1 EcoRI Digestion. Lanes: (1) 100 bp MW Ladder, (2) Lemur SCA-1 [3], (3) Lemur SCA-1 [4], (4) Lemur SCA-1 [7], (5) Lemur SCA-1 [8], (6) 100 bp MW Ladder. Size markers labeled: 300 bp, 200 bp]

DNA Sequencing
Transformed plasmid samples were sent to the University of Louisville for sequencing. Sequences were analyzed using Vector NTI (Version 10.0).
Multiple sequence alignments of the CAG repeat region from 14 species were analyzed using the ClustalW program (Vector NTI). This program allows for analysis of this region in an evolutionary context.
[Figure: Projected Phylogenetic Relationship]
[Figure: CAG Repeats In The SCA-1 Gene of Healthy Animals. Bar chart; x-axis: Species, y-axis: Number of CAG Repeats]

Conclusions
• We have successfully isolated the CAG repeat region of SCA-1 from 5 primates.
• The number of CAG repeats in healthy primates ranged from 2 to 29.
• ClustalW analysis of the CAG repeat region showed fairly predictable phylogenetic relationships between the species examined.
• The closer the evolutionary relationship a species has to humans, the more CAG repeats appear in the SCA-1 gene.
• It appears that a greater number of CAG repeats begin to appear in primates as fine motor skills and dexterity become refined.

References
1 Banfi, S., A. Servadio, M.Y. Chung, T.J. Kwiatkowski, Jr., A.E. McCall, L.A. Duvick, Y. Shen, E.J. Roth, H.T. Orr and H.Y. Zoghbi. 1994. Identification and characterization of the gene causing type 1 spinocerebellar ataxia. Nature Genetics 7: 513-520.
2 Orr, H.T., M.Y. Chung, S. Banfi, T.J. Kwiatkowski, Jr., A. Servadio, A.L. Beaudet, A.E. McCall, L.A. Duvick, L.P.W. Ranum and H.Y. Zoghbi. 1993. Expansion of an unstable CAG repeat in spinocerebellar ataxia type 1. Nature Genetics 4: 221-226.
3 Everett, C.M. and N.W. Wood. 2004. Trinucleotide Repeats and Neurodegenerative Diseases. Brain 127: 2385-2405.

Acknowledgements
We would like to thank Dr. Roy Burns and the Louisville Zoo for their help. This project was partially supported by NIH Grant Number P20 RR16481 from the BRIN Program of the National Center for Research Resources. We would also like to thank Dr. Ric Devor, Integrated DNA Technologies INC., Coralville, IA, and Dr. Steven Wilt, Bellarmine University.
Species              Number of CAG Repeats In SCA-1 Gene
Human                29
Orangutan            24
Chimpanzee           23
Gorilla              22
Orangutan            16
Assamese Macaque     14
Vervet               13
Bonnet Macaque       13
Lemur                4
Dog                  4
Woolly Monkey        2
House Mouse          2
Norwegian Mouse      2
Chicken              2

[Photo, 1966: Arlo, Woody Guthrie (died at 55, in 1967), Sarah]

Huntington's disease is…
…a genetic disease of the central nervous system that produces speech slurring, involuntary movements, & progressive dementia.
It usually starts between the ages of 30 and 50, and causes death after about 20 years (usually of pneumonia, choking, or heart failure). Suicide is common.
Between 100,000 and 250,000 Americans have it (or will when they get older).
…one birth in every 10,000 has the disease.
It is a dominant mutation which is easily passed on because people don’t know they have it until later in life. There is no known cure.

This disease is named after Dr. George Huntington (Long Island), who first described it in 1872. He was not a patient himself: his father and grandfather were physicians who had treated affected families before him. Some early affected settlers (who first came to America in the 1630’s) are said to have done so after being persecuted in Europe for consorting with the “devil” and for practicing witchcraft. It was probably the Huntington’s Disease that caused people to conclude they were “possessed”.
(At first, people thought Woody Guthrie was an alcoholic…. then schizophrenic….)

The story of Nancy Wexler
Nancy Wexler (Ph.D. in psychology)
Her mother died of Huntington’s, so she may carry the gene herself.
In 1979, she saw a film about a Venezuelan village where an excessive number of people had the disease…...
She got a federal grant to visit the village and interview the people. Once they found out she might have the disease too, they eventually grew to trust her (and answered her prying questions).
She built a pedigree chart of 15,000 Venezuelans and collected blood from 3,500 of them. This took 13 years….

How did this disease originate in this little village in Venezuela?
In the 1800’s a Portuguese sailor came to the village. Some rumored he was a drunkard because he always walked as if he was intoxicated. Eventually, he married a local woman and had numerous children. Later, he died of unknown causes......
But his gene for Huntington’s Disease still survives in this village today (seven generations later)….
Of his 5,000 direct descendants, 250 have Huntington’s Disease (that is 1 out of every 20).

The gene was isolated in 1993.
Chromosome 4
Dominant.
The CAG repeats occur in the first exon.
Normal = 6 - 35 repeats
Diseased = 40 - 121 repeats

What is affected by the mutated Huntingtin protein?
[Figure: Granular and filamentous inclusions. Brain Journal, 2004, Everett & Wood, pp. 2385-2405]

And that’s why Positional Cloning is important, honey!

Some Human Disease Genes identified by Positional Cloning:
1986 = Duchenne Muscular Dystrophy
1989 = Cystic Fibrosis
1990 = 4 more
1991 = Fragile X Syndrome & 3 others
1992 = Lowe Syndrome & 2 others
1993 = Huntington’s Disease & 11 more
1994 = BRCA1 (Breast Cancer), Dwarfism & 11 others
1995 = Alzheimer’s II, BRCA2 & 9 others
1996 = X-linked Myotubular Myopathy & 15 others
1997 = Deafness (DFNA1), Tuberous Sclerosis, Juvenile Glaucoma & 13 others
1998 = Congenital Night Blindness, Juvenile Parkinson’s Disease & 10 others

How does the Huntington’s Disease gene actually cause disease?
A “degenerative disease”… it takes 10-20 years before becoming fatal.
Apparently, it is a mutation that causes the repetition of the sequence “CAG”. Whereas a healthy person has 20 or so repeats (CAGCAGCAGCAGCAG...), people who have this disease have from 39 to 125 CAG repeats in a row.
The more CAG repeats they have, the earlier the disease shows up.

# of CAG Repeats     Median Age at Onset*
39                   66 years
40                   59
41                   54
42                   49
43                   44
44                   42
45                   37
46                   36
47                   33
48                   32
49                   28
50                   27
*Age by which 50% of individuals will be affected

Why are these repeats so harmful?

Standard Genetic Code
(first base down the left side, second base across the top; " = same amino acid as the codon above)

       T              C              A              G
T   TTT Phe (F)    TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
    TTC "          TCC "          TAC "          TGC "
    TTA Leu (L)    TCA "          TAA Stop       TGA Stop
    TTG "          TCG "          TAG Stop       TGG Trp (W)
C   CTT Leu (L)    CCT Pro (P)    CAT His (H)    CGT Arg (R)
    CTC "          CCC "          CAC "          CGC "
    CTA "          CCA "          CAA Gln (Q)    CGA "
    CTG "          CCG "          CAG "          CGG "
A   ATT Ile (I)    ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
    ATC "          ACC "          AAC "          AGC "
    ATA "          ACA "          AAA Lys (K)    AGA Arg (R)
    ATG Met (M)    ACG "          AAG "          AGG "
G   GTT Val (V)    GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
    GTC "          GCC "          GAC "          GGC "
    GTA "          GCA "          GAA Glu (E)    GGA "
    GTG "          GCG "          GAG "          GGG "

CAG is the code for the amino acid Glutamine.
The locus is on Chromosome 4.
This CAG gets repeated up to 125x.
“HAP-1” is a protein that occurs in brain cells. Its normal function (whatever that is) might be blocked by the Glutamines.
This is why the mutation is Dominant. Even if an individual is Hh, they will have this faulty protein in their brain. It doesn’t matter if the other allele is normal.
It took another 10 years to discover all of this..
So, maybe extra HAP-1 protein could be delivered to patients’ brains, or some other molecule could be added to stop the Glutamines from attacking it…. there is a lot of research going on right now.

Basically, there are three molecular approaches that can be taken once a genetic disease is described at the biochemical level:
1) develop pharmaceuticals
2) gene therapy
3) early diagnosis
What are the advantages of early diagnosis?

There is a genetic test for Huntington’s Disease… It costs $1300.
Only 3% of the people in the U.S. who are “at risk” actually take the test…. Why so few, do you think?

There are other “Triple Repeat” diseases……
These genes are sometimes called “Stuttering” genes… they usually affect the neurological system.
• Mental Retardation … X Chromosome
• A type of Muscular Dystrophy … Chromosome 19

What does all this have to do with DNA Replication?
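The link between CAG repeats, glutamine, and age at onset can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`longest_cag_run`, `translate`), the toy sequences, and the partial codon table are made up for this example, and the onset figures are simply the median values from the slide table, not a clinical predictor.

```python
import re

# Partial codon table (assumption: only the codons used in the toy sequences below).
CODON_TO_AA = {"ATG": "M", "CAG": "Q", "CAA": "Q", "CCG": "P"}

# Median age at onset by repeat count, copied from the slide table.
MEDIAN_ONSET_BY_REPEATS = {39: 66, 40: 59, 41: 54, 42: 49, 43: 44, 44: 42,
                           45: 37, 46: 36, 47: 33, 48: 32, 49: 28, 50: 27}

def longest_cag_run(dna: str) -> int:
    """Number of repeats in the longest uninterrupted run of CAG (any reading frame)."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)

def translate(dna: str) -> str:
    """Translate codon by codon from position 0; unknown codons become '?'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA.get(dna[i:i + 3], "?") for i in range(0, usable, 3))

# Hypothetical toy alleles: a start codon, a CAG tract, and one trailing codon.
healthy = "ATG" + "CAG" * 20 + "CCG"
expanded = "ATG" + "CAG" * 45 + "CCG"

for name, allele in (("healthy", healthy), ("expanded", expanded)):
    n = longest_cag_run(allele)
    onset = MEDIAN_ONSET_BY_REPEATS.get(n)  # None when the count is outside the table
    print(name, n, "repeats; median onset from table:", onset)

print(translate("ATG" + "CAG" * 3 + "CCG"))  # every CAG becomes one Q (glutamine)
```

Each extra CAG adds one glutamine (Q) to the protein, which is why the expanded region is called a poly-Q tract.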
Stuttering genes, and other regions in the genome that have repeats, likely became that way because of mistakes during DNA replication.

DNA Replication
A very quick look..
What does the “S” stand for ??
[Figure: The “Klenow Fragment” of DNA Polymerase]
See places where replication is taking place?

The building block of a new DNA strand
QUESTION: What is the function of the three phosphate groups?
Energy.... the last two phosphates break away when the phosphodiester bond is formed to the 3’ end of the adjacent nucleotide.
QUESTION: If another nucleotide base was going to be added to this molecule, where would it be added?

DNA Polymerase is a “stupid” enzyme…. It has to be told when & where to start doing “its thing”...
“Primers” tell the enzyme where to begin DNA replication.
(Think of “priming” before you paint, or “priming” a water pump.)

The two new strands grow in opposite directions.
[Figure label: RNA Primase]
DNA (and RNA) bases are always added at the 3’ end of the nucleic acid chain.

Proteins involved in eukaryotic DNA replication:
1. Origin Recognition Complex (ORC) = binds to DNA sequences that represent “initiation sites”.... eukaryotes have lots of these initiation sites available to start replication. They allow the next two enzymes to do their thing.
2. Helicase = unwinds DNA where the ORC is, separating the two strands
3. Topoisomerase = prevents original DNA from getting tangled
4. RNA Primase = adds 11 RNA bases near the initiation site ...this tells DNA Polymerase where to start...
5. DNA Polymerase = a large complex of proteins that grabs the appropriate nucleotide triphosphate (one that is complementary to the DNA strand) and adds it to the 3’ end of the new strand.
6. RNase = an enzyme that removes RNA primers after replication. When that’s been done, DNA Polymerase fills in these gaps with the appropriate DNA sequence.
7. DNA Ligase = forms phosphodiester bonds between the DNA pieces that are not yet connected (called “nicks”). A nick is when the phosphodiester bond is broken on one strand but not the other.

Do cells ever make mistakes in copying DNA?
Absolutely. Remember that cells can replicate all of their DNA in hours or even minutes.... so there are bound to be errors.
An error in replication is estimated to occur about once every 0.1 to 1 billion bases. But since we have 6 billion bases per cell, that is between 6 and 60 mistakes per cell each time the DNA is copied.
…although most of them get corrected, the mistakes that persist represent “mutations”.
For these mutations to be passed on to the next generation, the mutation would have to be carried in the gametes.
[Figure label: Fertilized egg]
What can happen if ordinary somatic cells obtain new mutations in their DNA?

These mutations can show up as:
1) Deletion of a base
2) Addition of a base
3) Repetition of bases (like the CAG Repeat that causes Huntington’s Disease)
4) Substitution of one base with another

The environment can also induce mutations:
1) Ultraviolet Radiation (sunlight)
2) X-ray Radiation (medical)
3) Chemicals like tobacco smoke, asbestos, vinyl chloride, alcohol, mustard gas, creosote
4) Anything that generates free radicals (such as reactive oxygen species)
5) Really bad TV Shows, like “Bachelor”...
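The “6 to 60 mistakes per cell” figure quoted earlier is plain division, and a short Python sketch makes the arithmetic explicit. The function name `expected_errors` is an assumption made for this illustration; the numbers are the ones quoted in the slides (6 billion bases per cell, one error per 0.1 to 1 billion bases).

```python
GENOME_BASES = 6_000_000_000  # bases per cell, as quoted in the slides

def expected_errors(genome_bases: int, bases_per_error: int) -> float:
    """Expected number of copying errors in one round of replication."""
    return genome_bases / bases_per_error

# Best case: one error per 1 billion bases. Worst case: one per 0.1 billion.
fewest = expected_errors(GENOME_BASES, 1_000_000_000)
most = expected_errors(GENOME_BASES, 100_000_000)

print(f"between {fewest:.0f} and {most:.0f} mistakes per cell per replication")
```

This counts errors before any repair; as the slides note, most of them get corrected, so the number of persisting mutations is smaller.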
[Slide: the Standard Genetic Code table, shown again]
Remember that some mutations are going to be more important than others.
A change from GGT to GGC, for instance, would still yield a glycine.

(Nature Genetics, 1993)
So, Positional Cloning techniques were used to isolate a 1,200,000 bp piece of Chromosome #6.
Less than 1% of this region actually codes for the SCA-1 transcript (mRNA).

For Sickle Cell Anemia
[Figure: Globin gene, Healthy vs. Anemic. Use a probe for this region.]
Probes are valuable for identifying the mutations in a well-characterized gene.

Southern Blots (of genomic DNA) following digestion with EcoRI enzyme
A and B are homologous chromosomes. (You can think of “B” as “little a”.)
EcoRI cuts the “A” allele in half, and Probe 3 allows you to visualize that. The “B” allele is not cut here.
Let’s pretend the “A” allele is the diseased allele.
If you made a genomic library of a person with an RFLP-mapped disease, you could use Probe 3 to screen the library. The other two probes would work too, but would be further away from the mutation.

[Figure: Southern blot lanes labeled Healthy, Diseased, Diseased, Healthy, Diseased, Healthy. The RE site for this disease must be here.]
So, the hard part is finding the right combination of RE and probe that reveals an RFLP…. which is one reason why Positional Cloning is so slow and expensive.
If this band is always present in people with the disease, then the probe could be useful in screening a library.
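The RFLP idea above (a restriction site present on one allele and missing on the other, seen through a probe) can be sketched as a toy simulation in Python. Everything here is a simplified assumption for illustration: the allele and probe sequences are invented, the DNA is treated as a single linear strand, and a probe is said to “light up” a fragment if the fragment contains the probe sequence. The EcoRI recognition site GAATTC, cut after the first G, is the only real-world fact used.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the first G

def digest(dna: str, site: str = ECORI_SITE, cut_offset: int = 1) -> list[str]:
    """Complete digest of linear DNA: cut at every occurrence of the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

def probed_fragment_lengths(dna: str, probe: str) -> list[int]:
    """Lengths of the fragments a probe would reveal on a (toy) Southern blot."""
    return sorted(len(frag) for frag in digest(dna) if probe in frag)

# Hypothetical alleles: "A" carries the EcoRI site, "B" has a one-base change in it.
probe = "CCGGTTAA"
left, right = "ACGT" * 3, "TTTT" + probe + "ACAC"
allele_a = left + "GAATTC" + right  # cut by EcoRI
allele_b = left + "GAGTTC" + right  # site destroyed, not cut

print("A allele band(s):", probed_fragment_lengths(allele_a, probe))
print("B allele band(s):", probed_fragment_lengths(allele_b, probe))
```

The two alleles give bands of different sizes with the same enzyme and probe, which is the restriction fragment length polymorphism. A probe that sits far from the polymorphic site may fall on a fragment that is the same size in both alleles, which is why the choice of RE and probe matters.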
One way of finding the best probe is: “Chromosome Walking”
If linkage (by pedigree analysis) can be shown between the disease and a gene that is already cloned, then begin there, and “walk” to the gene of interest.
[Figure: Chromosome Walking. The linked gene is the starting point; each step uses a different library made with a different RE.]
Each time you make a new probe, use that to look for RFLPs in healthy vs. diseased people.

If an RFLP can’t be found for the disease of interest (for instance, point mutations wouldn’t reveal themselves as RFLPs unless the single mutation was exactly on an RE site) you can look at transcription.
mRNA can be isolated from healthy vs. sick people (using Poly-A chromatography) and then run on a gel, transferred to a membrane, and probed just like a Southern Blot.

NORTHERN BLOT
If the disease of interest involves muscle tissue then this probe might be important… especially if it doesn’t occur in diseased people.
[Figure: Northern blot showing the presence of mRNA hybridizing to sadA cDNA in different types of tissue. 1, Dry seeds; 2, seeds after 16 h of soaking in tap water; 3, shoots 9 d after sowing; 4, cotyledons 14 d after sowing; 5, leaf buds 14 d after sowing; 6, cotyledons 21 d after sowing; 7, second leaf pairs 21 d after sowing; 8, third leaf pairs 21 d after sowing; 9, fourth leaf pairs 21 d after sowing; 10, fifth leaf pairs 21 d after sowing; 11, roots from plants grown in vermiculite 14 d after sowing; 12, roots from plants grown in vermiculite 21 d after sowing; 13, roots from plants grown in vermiculite 42 d after sowing; 14, stems 21 d after sowing; 15, tendrils; 16, flowers (white); 17, flowers (purplish); and 18, pods.]

[Figure: Southern Blot showing “Anticipation”]

The length of a centiMorgan (in terms of DNA bases) is different for each species……
In Humans: 1 cM = 1 million DNA bases (on average)
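As a rough worked example of that conversion, here is a small Python sketch. The constant and function names are assumptions made for this illustration, and the 1 cM = 1 million bases figure is only the human average quoted above, so results are order-of-magnitude estimates rather than real map distances.

```python
BASES_PER_CM_HUMAN = 1_000_000  # human average quoted in the slides

def cm_to_bases(cm: float, bases_per_cm: int = BASES_PER_CM_HUMAN) -> float:
    """Approximate physical distance (bases) for a genetic distance in cM."""
    return cm * bases_per_cm

def bases_to_cm(bases: float, bases_per_cm: int = BASES_PER_CM_HUMAN) -> float:
    """Approximate genetic distance (cM) for a physical distance in bases."""
    return bases / bases_per_cm

# The 1,200,000 bp piece of Chromosome 6 mentioned for SCA-1:
print(bases_to_cm(1_200_000), "cM (approximate)")
# A marker 5 cM from a disease gene would be roughly this many bases away:
print(cm_to_bases(5), "bases (approximate)")
```

For another species, a different `bases_per_cm` value would have to be supplied, since the ratio is not the same across genomes.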