Chromosomal Anatomy

I. SIZE
  A. The haploid human genome contains about 3 billion base pairs divided among 23 chromosomes
    1. The human genome is thought to contain around 30,000 - 40,000 genes
    2. The smallest chromosome contains around 50 million base pairs and the largest around 250 million base pairs
  B. DNA in chromosomes may be as long as 50 millimeters if relaxed, but is often packed into as little as 0.005 millimeters (5 µm)

II. CENTROMERES
  A. Location and function
    1. Thin region of chromosome
    2. Attachment site for kinetochore microtubules
      a) Essential for proper partitioning during mitosis and meiosis
  B. Types of centromeres
    1. Point centromeres
      a) Found in yeast
      b) 225 base pairs
        (1) Region I
          (a) Highly conserved 8 base pair region
        (2) Region II
          (a) AT-rich variable sequence
        (3) Region III
          (a) Highly conserved 26 base pair region
          (b) Mutations are least tolerated in this region
    2. Regional centromeres
      a) 19K to 100K base pairs
      b) Repeated sequences
        (1) In humans, a 170 base pair sequence may be repeated 5,000 - 6,000 times
          (a) The actual number of repeats is unique to specific chromosomes
        (2) These repeated sequences are referred to as satellite DNA
        (3) These sequences are not transcribed
  C. Centromere placement
    1. Metacentric
      a) Chromosomes have the centromere located in the middle
      b) The two arms are about equal in length
    2. Submetacentric
      a) Chromosomes have the centromere near the middle
      b) Arms are similar, but not equal, in length
    3. Acrocentric
      a) Chromosomes have the centromere located near one end
      b) One pair of arms is long and the other is short
    4. Telocentric
      a) Chromosomes have the centromere at the end
      b) Only one arm
      c) Not present in humans

III. TELOMERES
  A. Functions
    1. Prevent end-to-end fusion of chromosomes
      a) Broken DNA ends are usually ligated to other linear ends; this is not the case with telomeres
    2. Prevent degradation by exonucleases
    3. Necessary for chromosomal replication
  B. Structure
    1. TTAGGG repeated 250 - 1000 times
    2. This sequence can form a hairpin terminus
  C. Problems replicating the ends of a linear chromosome
    1. DNA is synthesized only in the 5' to 3' direction
    2. The G-rich strand ends up single-stranded
      a) 12 - 16 base overhang
  D. Telomerase
    1. Enzyme with an associated RNA molecule
      a) 3'-AACCCCACC-5'
    2. Part of the RNA hybridizes with the 5'-3' strand and the rest serves as a template for DNA synthesis
  E. 3'-5' strand?
    1. A hairpin loop of G may serve as the primer for this strand
  F. Cancer
    1. Cells won't replicate if telomeres are absent
      a) In lower eukaryotes, telomeres are constantly re-replicated
        (1) These cells are immortal; they can replicate indefinitely
      b) Telomerase is not active in most somatic cells
        (1) These cells replicate until the telomeres are too short
      c) Telomerase needs to be activated for cells to become cancerous; these cells then become immortal

IV. NOMENCLATURE
  A. Chromosome numbers
    1. 1 - 22
      a) Largest to smallest
    2. X and Y
  B. Arms
    1. p-arm
      a) Short arm
    2. q-arm
      a) Long arm
  C. Region
    1. Area between two major landmarks
    2. Centromere to major band, or major band to telomere
    3. Numbered from the centromere outward
  D. Bands
    1. Several different banding patterns can be detected depending on which technique is used
  E. Example
    1. 7q31
      a) The gene for cystic fibrosis is on chromosome 7, on the long arm, in region 3, band 1

V. DNA SEQUENCES
  A. Unique DNA
    1. Sequences present only once in the chromosome
    2. These sequences usually represent genes that code for proteins
      a) Only 1 - 2% of the human genome codes for proteins
    3. Introns are regions of DNA in genes that are transcribed, but removed from the RNA before translation
      a) These account for about 10% of the human genome
  B. Moderately repetitive DNA
    1. Makes up about 30% of the human genome
    2. Short interspersed elements (SINEs)
      a) Interspersed elements, less than 500 base pairs
        (1) Found both between and within genes
        (2) The common SINE in humans is the Alu repeat, between 200 and 300 base pairs long
          (a) It is so named because it is cut by the restriction enzyme AluI
          (b) It makes up about 10% of the human genome
      b) May be found more than a million times in the genome
    3. Long interspersed elements (LINEs)
      a) Sequences 5,000 to 10,000 base pairs long
      b) The most common LINE in humans is designated L1
        (1) It is a 6,400 base pair sequence repeated 3,000 to 40,000 times
    4. Variable number tandem repeats (VNTRs)
      a) Instead of being dispersed, VNTRs are repeated sequences located adjacent to one another
      b) 15 to 100 base pair sequences repeated 10 - 100 times
        (1) Persons inherit a certain number of repeats from their parents; the number of repeats can be used in DNA fingerprinting techniques
  C. Highly repetitive DNA
    1. 300 - 3 million copies
      a) 97% of crab DNA is ATATATAT
      b) Cows have a 1400 base pair repeat
      c) 5% of the human genome is a 150 - 300 bp Alu repeat
    2. This DNA is referred to as satellite DNA
      a) It is called satellite DNA because it forms a separate band (a satellite) from the rest of the DNA when subjected to CsCl density-gradient ultracentrifugation
        (1) The separate band reflects a different density caused by variation in the G + C content
      b) Examples
        (1) Cows have a 1400 base pair repeat
        (2) Monkeys have a 172 base pair repeat
        (3) Crabs have a two base pair repeat (ATATAT)
      c) Location
        (1) Satellite-probe hybridization study
          (a) Fix cells to slide
          (b) Alkaline denature
          (c) Add radioactive satellite DNA probe
          (d) Rinse
          (e) Autoradiograph
        (2) Localized in heterochromatin, near centromeres, with smaller blocks at telomeres
          (a) The repeats and their distribution differ for each chromosome
  D. Special base sequences
    1. Palindromes
      a) A sequence that reads the same in the 5' to 3' direction on complementary strands
        (1) Regions of dyad symmetry
      b) Example
        (1) ATCGCGAT
      c) May form hairpin (no spacer) or stem-loop (with spacer) structures
    2. Direct repeats
      a) No effect on structure with or without spacers
  E. Multi-gene families
    1. Definition
      a) Genes that share a high level of homology
      b) Probably arose from ancestral genes that were duplicated within the genome
        (1) The copies could then mutate and evolve separately
    2. Example 1 -- rRNA genes
      a) Since the products of these genes are needed in large supply, there are several copies of the genes located near each other
    3. Example 2 -- The globin gene family
      a) An ancestral globin gene gave rise to myoglobin and the various hemoglobins
        (1) Hemoglobin is a tetramer (composed of four subunits)
      b) The ancestral globin gene gave rise to alpha (α) and beta (β) types
        (1) The alpha gene gave rise to two zeta (ζ), α1 and α2 genes
        (2) Beta gave rise to two gamma (γ, fetal), an epsilon (ε, embryonic), a delta (δ, immature) and a beta (β, adult) gene
      c) Types of hemoglobin
        (1) Mature adult RBCs contain α1β1
        (2) Embryonic RBCs contain ζ and ε chains
        (3) Fetal RBCs contain α2γ2
          (a) Higher affinity for oxygen than adult hemoglobin
        (4) Immature RBCs contain α1δ1
  F. Pseudogenes
    1. Some multigene families contain pseudogenes
      a) A DNA sequence similar to the genes of the family, but with enough mutations that it is no longer expressed
      b) These genes are designated with a prefix of psi (ψ)
    2. Examples
      a) Both the alpha and beta globin gene families have non-functional pseudogenes
        (1) ψα1 would represent a pseudogene most closely related to the α1 gene
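
Two definitions in this outline are mechanical enough to try out in code: the band designation under IV.E (7q31) and the palindrome definition under V.D (ATCGCGAT). The Python sketch below is illustrative only. The names reverse_complement, is_palindrome, parse_band and BAND_PATTERN are hypothetical helpers, not part of any library, and the pattern assumes a short designation with exactly one region digit and one band digit, with no sub-bands.

```python
import re

# Complement of each DNA base (assumes an uppercase A/C/G/T alphabet).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindrome(seq: str) -> bool:
    """True when both strands read the same in the 5' to 3' direction."""
    return seq.upper() == reverse_complement(seq)


# Assumed format: chromosome (1-22, X or Y), arm (p or q), region digit, band digit.
BAND_PATTERN = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)$")


def parse_band(designation: str) -> dict:
    """Split a designation such as '7q31' into chromosome, arm, region and band."""
    match = BAND_PATTERN.match(designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    chromosome, arm, region, band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short" if arm == "p" else "long",
        "region": int(region),
        "band": int(band),
    }


if __name__ == "__main__":
    print(is_palindrome("ATCGCGAT"))
    print(parse_band("7q31"))
```

Comparing a sequence with its reverse complement is a direct restatement of "reads the same in the 5' to 3' direction on complementary strands", so the check needs no alignment or search. The band parser does not validate that a chromosome number lies between 1 and 22; that check is left out to keep the sketch short.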