Genetics and diagnostics: the potential, the choices

Introduction

The sequencing of the human genome provides not only a route to an improved understanding of our own biology but also the basis for a quantum jump in medical science. By combining genetics and medicine in new genetic diagnostic tools, physicians will be able to predict or anticipate disease and, more importantly, tailor molecular treatments to the genome of the patient. Improvements in the timing and accuracy of diagnosis should translate into improved prognoses and a better quality of life, with cost savings in treatment. This exciting prospect cannot, however, be introduced without (1) a concord between the scientists and physicians who understand the technology and the wider community; and (2) a legislative framework in which genetic ‘risk’ is managed to protect all members of society equally. In this seminar, I welcome the opportunity to review the current level of understanding and examine the prospects for the future in the area of gene diagnostics.

DNA – the scientific and cultural icon

The elucidation of the structure of deoxyribonucleic acid (DNA) by Watson, Crick, Wilkins and Franklin at Cambridge and London Universities revolutionised biological science and entered mainstream culture as a motif. At a genetic level, the double helix provides a clear mechanism by which genetic information can be encoded, copied and varied through the generations. It explains how ‘like begets like’.

A single DNA molecule forms each individual chromosome. Each species has a characteristic number of chromosomes, which are present in the nucleus of all dividing cells. The chromosomes are very large molecular structures and are visible under the light microscope. They can be seen doubling and separating into the daughter nuclei at each cell division. The human genome comprises 23 pairs of chromosomes: 22 autosomal pairs and the sex chromosomes.
Each pair is made up of one maternally and one paternally inherited chromosome. The sex chromosomes determine sex: females carry two X chromosomes, males carry an X and a Y. The sex of the child is normally determined by the father. Sperm and ova each carry one chromosome of every autosomal pair and one sex chromosome (either an X or a Y). Fusion of sperm and ovum unites the single chromosomes into pairs to form the genome of the foetus. Thus the child may ‘have its mother’s eyes and father’s nose’.

The DNA helix and the genetic code

Our genetic information is contained in the order of the bases that hold the two strands of the DNA helix together. In essence, the DNA helix resembles a twisted ladder. DNA contains four bases, A, T, G and C, which, when paired A=T or G=C, form the rungs of the ladder. Because of chemical constraints, A binds only T, and G binds only C. Thus where one half rung is an A, the complementary half rung will always be a T, and so on.

And mutation…

The ability to incorporate change is an important feature of heredity. Changes in genetic information generate the genetic variation upon which Natural Selection can operate to drive evolution. During cell division, DNA replication, the process of copying DNA molecules, ensures that the original DNA helices are faithfully duplicated. The DNA strands are unwound and each parental strand is used as a template for the synthesis of a complementary strand. The new and old strands are then re-formed into a tightly wound helix. Although the replication process has high fidelity, errors do occur at very low frequencies (1 in 10 million). These errors form mutations, or changes relative to the original DNA sequence.

Proteins - all biology is shape

Living cells are based on proteins. These are long, linear chains of amino acids. All proteins are made from just 20 different amino acid types. Proteins form the cell structure, the cell enzymes and the cellular control systems (signalling molecules, hormones, gene-regulatory factors).
All of these proteins have highly specific three-dimensional shapes. Each of the 20 different amino acids has a unique, complex three-dimensional shape and a unique reaction to water. Some amino acids are water loving, others water hating. The shape of a protein is determined by the sequence and identity of the amino acids in the chain: the amino acids, each with its own shape and reaction to water, interact to produce the specific protein shape. The same amino acid sequence will always form the same three-dimensional shape. It is this order of amino acids that is controlled and encoded by the sequence of bases on the DNA helix. A gene, a defined region of the chromosome, is a unique sequence of bases which encodes the order of amino acids for its protein product. Each protein in the human body is encoded by one gene.

The DNA code

The DNA code is formally similar to Morse code, which is composed of dots (.) and dashes (-). Using different combinations of dots and dashes, all 26 letters of the alphabet can be encoded in Morse. The genetic code is based on triplets of bases. Since there are four bases (A, T, G, C), the total number of unique triplet combinations is 4 cubed, which equals 64 different code words. Since there are only 20 amino acids in proteins and 64 different words, the genetic code has synonyms and punctuation. The synonyms and the nature of the genetic code operate to reduce the severity of DNA sequence changes introduced by mutation. These features mean that most mutations are subtle changes. Genes are marked out by rudimentary punctuation. The gene start is signalled, inter alia, by ATG or GTG triplets, and all genes terminate at stop signals, which can be TAA, TAG or TGA.
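
The counting argument and the punctuation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name find_open_reading_frame and the toy sequence are invented here, the start and stop triplets are simply those named in the text, and real gene finding is considerably more involved.

```python
from itertools import product

BASES = "ATGC"
START_CODONS = {"ATG", "GTG"}        # start signals named in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop signals named in the text

# Every possible triplet of the four bases: 4 ** 3 = 64 code words.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]


def find_open_reading_frame(dna):
    """Return the first stretch running from a start triplet to a stop
    triplet, read in steps of three bases, or None if there is none.

    Hypothetical helper for illustration only.
    """
    for start in range(len(dna) - 2):
        if dna[start:start + 3] not in START_CODONS:
            continue
        # Read onward in the same frame, one triplet at a time.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return dna[start:pos + 3]
    return None


toy_sequence = "CCATGGCTTGGTAACC"  # invented sequence, not real data

print(len(codons))                            # number of code words
print(len(codons) - len(STOP_CODONS))         # words left over for amino acids
print(find_open_reading_frame(toy_sequence))  # start-to-stop stretch
```

With three of the 64 words reserved as stop signals, 61 words remain for 20 amino acids, which is why most amino acids must be spelt by more than one triplet.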

DNA Regulatory Elements

The expression of genes, the process by which the information within the DNA base sequence is transcribed and translated into the linear chain of amino acids of a given protein, is controlled by DNA regulatory elements. These are special sequences of DNA upstream of the gene at which the cellular machinery acts to turn expression of the gene on or off. Changing the DNA sequence of these regulatory elements can have widespread effects on the level, timing and synchronisation of expression.

The Human Genome Project

The determination of the 3.2 billion base pair sequence of all 23 chromosomes within the human genome is an outstanding scientific achievement. It was initiated with speculative governmental support and greatly accelerated through international collaboration between publicly funded and private laboratories. Several features of this project are unique and deserve mention. The collaboration involved 20 research groups around the world from 6 countries (USA, UK, Japan, France, Germany and China). Although this approach increased overhead costs, from a societal point of view sharing the work without reference to national interests was essential. The human genome sequence data are available to all, free and without restriction. Sequence data were released onto the world wide web within 24 hours of their determination. This data stream has stimulated intense scientific effort throughout the world. The draft genome sequence, covering in excess of 94% of the genome, was published on 26 July 2000. A final, 99.9% complete sequence should be finished well before the predicted end date of 2003. Details, background and sequence data can be obtained from any one of several web sites (for example www.sanger.ac.uk).

Donor Procedure

The sequence is a composite derived from an unspecified number of individuals.
Donors were anonymous; biological samples (blood and sperm) were obtained in accordance with the US Federal Regulations for the Protection of Human Subjects in Research (45CFR46). Advertisements placed in the locale of the two sampling laboratories generated a stream of volunteers. Donors with diverse backgrounds were taken on a first come, first served basis. Samples were obtained after discussion with a genetic counsellor and with written informed consent. Around 5 to 10 samples were collected for every one that was used. The identity of the donors for the libraries is not known, even to the donors themselves (http://www.nhgri.nih.gov/Grant_info/Funding/Statements/RFA/human_subjects.html).

How many genes make a human?

Analysis of the draft human genome sequence indicates that only 5% of the 3.2 billion base pairs of our DNA actually encode genes. A far larger proportion, approximately 50%, is made up of repetitive DNA, so-called ‘junk’ DNA. The challenge is to distinguish the genes from the junk so as to build a complete list of all human genes and their products for genetic and medical research.

The sequence allows a direct calculation of the total number of genes to be made. Since the discovery of the DNA helix, the number of genes required for the human genome has fascinated biologists. There have been several estimates: (1) reassociation kinetics estimated the mRNA complexity of typical vertebrate tissues to be 10,000 to 20,000, which was extrapolated to suggest around 40,000 for the entire human genome; (2) Walter Gilbert, in a back-of-an-envelope calculation (which rather embarrassingly stuck!), suggested that there might be about 100,000 genes, based on the approximate ratio of the size of a typical gene to the size of the genome; (3) an estimate of 70,000-80,000 genes was made by extrapolating from the number of CpG islands and the frequency of their association with known genes; (4) estimates based on ESTs range from 35,000 to 120,000 genes.
Rigorous calculations consistently produce low estimates, in the region of 35,000. Direct analysis of the draft genome sequence suggests a figure of around 31,000.

Disease implications of 31,000 human genes

Gene number estimates in the 100,000 range easily accommodate the ‘gene for’ syndrome. This alarming determinist view suggests that we are merely the product of our genes. Thankfully, such extreme views are not supported by the draft sequence data. However, with respect to disease and genetic diagnostics, a small number of human genes implies a high level of genetic interaction. A particular disease precondition is therefore the property of a large number of particular gene sequences interacting together. More extensive analysis would thus be required to establish a diagnosis, but this is technically feasible.

Human genetic diseases

We accept that some diseases result from a particular version of the disease gene (the good and the bad allele). Examples include Huntington’s disease, muscular dystrophy, cystic fibrosis, haemophilia and sickle cell anaemia, for which gene probe tests already exist or which may be predicted from a family pedigree. In the same way, combinations of genes or DNA sequences may produce a genetic predisposition to a particular disease or adverse drug reaction. Equally important, given that large numbers of genes interact to produce a particular predisposition, no single individual is likely to be genetically perfect. In fact it is statistically very unlikely. Therefore, every individual will carry some deleterious genes in his or her genome.

SNPs describe the individual

Single nucleotide polymorphisms (SNPs) are single base sequence changes at a given position between different chromosomes. In disease genes such as those mentioned above, the good and bad gene versions differ in their sequence at only one position, that is, by a SNP.
The critical thing about SNPs is that they are subtle genetic changes which, when combined, account for our individuality. As part of the Human Genome Project, regions of the known sequence were resequenced using other random donors to establish the level of sequence heterogeneity. These studies identified in excess of 1.42 million SNPs, equating to 1 SNP per 1300 bp.

SNPs as a medical and diagnostic tool

The basis for understanding the correlation between disease and the genome is to show that a particular gene or chromosomal region is associated with a disease condition or, eventually, a disease precondition. The SNP map provides an average of 15 SNPs for gene loci of average size. Most practically, where a gene has been implicated in causing disease (by chromosomal position relative to linkage peaks, known biological function or expression pattern), analysis of the SNP profile of the region can lead to predictive indicators. Using the SNP map, it should be possible to evaluate the extent to which common haplotypes contribute to disease risk. As the speed and efficiency of SNP genotyping increase, such studies will fuel increasingly comprehensive tests of the hypothesis that common variants contribute significantly to the risk of common diseases. SNP maps correlated with disease and drug response profiles could provide physicians with better treatment regimens based on molecular diagnosis and genome-specific drug treatments.

The β-adrenergic receptor is one example of the usefulness of SNPs. At present, 483 drug targets for human disease are known. The Human Genome Project allows disease genes to be cloned, which will generate more targets, each of which could serve as a diagnostic. The SNP profile of the β-adrenergic receptor has been related to drug response in heart patients.

Prospects and Challenges

SNP-based genetic testing is in its early days (detection of factors that predetermine drug response, and of some single gene diseases). But the rate of data acquisition is phenomenal.
It is likely that for common disease conditions, for which a large patient cohort is available, SNP profiles will be forthcoming in the very near future. These will revolutionise some treatments in a manner analogous to the effect of DNA fingerprinting on criminal investigations. Physicians could provide more effective treatment of disease, and provide firm preventative advice where they felt it appropriate. This could improve the quality of life of the nation as well as release resources currently spent on less timely diagnoses and treatments.

To a large extent, the technical challenges to the construction of a SNP map are minimal. However, the social concerns are significant. It is important to remind ourselves that we all carry deleterious genes. As SNP-based diagnoses are brought on line, some diseases for which a test has been developed will gain in notoriety. Eventually, most disease conditions could in theory have a predictive SNP profile. Under these conditions, discrimination based on genes should diminish. For the short to medium term, therefore, it is imperative that legislation protects individuals from genetic discrimination whilst maintaining support for effective gene diagnostic tools which can improve the cost-benefit of health care.