Molecular Biology (Biochemistry of the Gene)

The Structural Basis of Cellular Information: DNA, Chromosomes, and the Nucleus
Cells possess a set of “instructions” that specify their structure, dictate their functions, and regulate their activities. These instructions can be passed on faithfully to daughter cells. Hereditary information is transmitted in the form of distinct units called genes.

Genes consist of DNA
Genes consist of DNA that codes for functional products, which are usually protein chains. The information in a cell’s DNA molecules undergoes replication to generate two copies, one for distribution into each daughter cell.
Figure 18-1A

Transcription and translation
Instructions stored in DNA are expressed in a two-stage process, called transcription and translation. Transcription: RNA is synthesized in an enzymatic reaction that copies information from DNA. Translation: the base sequence of RNA is used to direct the synthesis of a polypeptide.
Figure 18-1B

Chemical Nature of the Genetic Material
Walther Flemming first observed chromosomes under the microscope. Just a few years before, Miescher reported the discovery of the substance now known as DNA.

Miescher’s Discovery of DNA Led to Conflicting Proposals Concerning the Chemical Nature of Genes
Miescher extracted a material from white blood cells that he called “nuclein,” now called DNA. He also extracted it from salmon sperm and believed it to be involved in heredity. However, because of inaccuracies in the measurement of DNA in eggs and sperm, he changed his mind.

DNA and chromosomes
In the early 1880s a botanist named Eduard Zacharias found that removing DNA from cells abolished the staining of chromosomes. He and others began to infer that DNA is the genetic material. In the early 1900s, incorrectly interpreted staining experiments led to the false conclusion that the amount of DNA in cells changes dramatically.

Genes and protein
From 1910 to the 1940s most scientists believed that genes were made of protein rather than
DNA. Proteins were thought to be more complex than DNA and thus more likely to be the genetic material. This idea prevailed until two important lines of evidence confirmed that DNA is the genetic material.

Avery Showed That DNA Is the Genetic Material of Bacteria
F. Griffith, studying a pathogenic bacterial strain that caused pneumonia in animals, found two forms of the bacterium. The S-strain caused a fatal infection when introduced into mice; the R-strain was unable to do so.
Figure 18-2A, B

Genetic transformation
When dead S-strain and living R-strain bacteria were mixed and used to infect mice, the mice died. Griffith found many live S-strain bacteria in the dead mice. He concluded that the R-strain had been converted into S-strain, a process called genetic transformation.
Figure 18-2C, D, E

Avery and colleagues identified the transforming substance
Oswald Avery and colleagues followed up Griffith’s experiments by asking which component of the heat-killed S-strain bacteria was actually responsible for the transforming activity. They fractionated cell-free extracts of the S-strain bacteria and found that only the nucleic acid fraction, and not the protein fraction, was able to transform the R-strain. Digesting the DNA in the extract prevented transformation.
“It is lots of fun to blow bubbles, but it is wiser to prick them yourself before someone else tries to.”
Life is not fair! Sometimes luck is more important than real ability.

Hershey and Chase Showed That DNA Is the Genetic Material of Viruses
Bacteriophages (or just phages) are viruses that infect bacteria. Phage T2, which infects E.
coli, is one of the best studied. During infection the virus attaches to the bacterial cell surface and injects material into the cell.
Alfred Day Hershey and Martha Chase
Figure 18A-1
Figure 18A-2B

The genetic material of T2 phage
Soon after infection, the bacterial cell begins to produce thousands of new copies of the virus. Hershey and Chase labeled phage proteins with radioactive sulfur (35S) and phage DNA with radioactive phosphorus (32P), and allowed the phages to infect bacteria. In this way they could trace the fate of proteins and DNA during infection.

The experiment
After infection, once the phage material had been injected into the bacteria, the empty phage protein coats (ghosts) were removed by agitating the cells in a blender. The cells were recovered by centrifugation. Hershey and Chase then measured the radioactivity in the supernatant and in the cells at the bottom of the tube.
Figure 18-3A

The results
Most of the 32P remained with the bacterial cells, but the majority of the 35S was found in the surrounding medium. Hershey and Chase concluded that DNA and not protein had been injected into the bacterial cells. Therefore, DNA was the genetic material of the phage.
Figure 18-3B

Chargaff’s Rules Reveal That A = T and G = C
Erwin Chargaff was interested in the base composition of DNA, and used chromatographic methods to separate the four bases and quantify their relative amounts. He showed that DNA from different cells of a given species has the same percentage of each of the four bases, whereas the base composition varies among species.

Chargaff’s most striking observation
Chargaff observed that, for all DNA samples examined, the number of A equals the number of T and the number of G equals the number of C. These are called Chargaff’s rules. Their significance was not understood until Watson and Crick proposed the double-helix model for DNA structure.
Table 18-1
(Cartoon, two cells of the same species: “My GC content is 40%.” “Mine too.”)

DNA Structure
Once it was determined that DNA was the genetic
material, a new set of questions began to emerge. One of the first was how cells are able to replicate their DNA accurately so that it can be passed on during cell division. Answering this question required an understanding of the three-dimensional structure of DNA.

Watson and Crick Discovered That DNA Is a Double Helix
Watson and Crick built wire models to find a structure for DNA that agreed with everything then known about it. It was known that DNA has a sugar-phosphate backbone with a nitrogenous base attached to each sugar. It was also known that, at physiological pH, the bases would be able to form hydrogen bonds with each other.

The double helix model
The critical evidence came from X-ray diffraction data produced by Rosalind Franklin, which revealed that DNA is a long, thin, helical molecule. Based on this information and other observations, Watson and Crick produced the double helix model.
Watson and Crick, after 40 years

If it were not for my X-ray diffraction photograph, James and Francis would never have discovered the structure of DNA. I did not know that the other two men were using my own experimental findings.
But I did not complain.
Rosalind Franklin (1919–1956), a victim of ovarian cancer

The double helix model (continued)
In the double helix, the sugar-phosphate backbones are on the outside of the helix with the bases on the inside, forming the “steps” of a “spiral staircase.” There are 10 nucleotide pairs per complete turn, and 0.34 nm per nucleotide pair. The 2-nm diameter of the helix is too small for two purines and too large for two pyrimidines, but just right for one of each.
The purine-pyrimidine pairing is consistent with Chargaff’s rules. The two strands are held together by hydrogen bonding between bases on opposite strands. The hydrogen bonds only fit within the helix when they form between complementary bases: adenine with thymine and guanine with cytosine.
Figure 18-4A
Figure 18-4B

Replication of genetic information
The most important aspect of the double helix model was that it suggested a mechanism for replication of DNA. The two strands could separate so that each could act as a template to dictate synthesis of a new complementary strand.

Antiparallel strands
The phosphodiester bonds that join the 5′ carbon of one nucleotide to the 3′ carbon of the next are oriented in opposite directions in the two DNA strands. This is called antiparallel orientation and has important implications for both replication and transcription.

Types of helix
The right-handed helix is called B-DNA; it is the main form of DNA. Naturally occurring B-DNA helices are flexible, with shapes and dimensions that vary with nucleotide sequence.
Z-DNA is a left-handed double helix whose biological significance is not well understood. The name Z derives from the zigzag pattern of its sugar-phosphate backbone. Z-DNA is longer and thinner than B-DNA, and it can be found in sequences of alternating purines and pyrimidines or where cytosines carry methyl groups.
A-DNA is a right-handed helix that is shorter and thicker than B-DNA.
A-DNA can be created artificially by dehydrating B-DNA.
Figure: A-DNA (left), B-DNA (middle), and Z-DNA (right), 12 bp each

DNA Can Be Interconverted Between Relaxed and Supercoiled Forms
The DNA double helix can be twisted upon itself to form supercoiled DNA. Positive supercoil: the DNA is twisted farther in the same direction as the helix. Negative supercoil: the DNA is twisted in the direction opposite to the helix. Relaxed: no additional twisting of the DNA.
Figure 18-6 (Top)
Figure 18-6 (Bottom)

Supercoiling and gene regulation
Supercoiling affects both the spatial organization and the energy state of DNA, and so affects the ability of the DNA to interact with other molecules. Positive supercoiling favors tighter winding of the double helix, which reduces the chances of interaction. Negative supercoiling tends to unwind the double helix, which increases access for proteins involved in replication or transcription. Circular DNA molecules found in nature, including those of bacteria, viruses, and eukaryotic organelles, are invariably negatively supercoiled.
(Cartoon: positively supercoiled DNA, “Nothing can get in!”; negatively supercoiled DNA, “You can come in!”)
Interconversion between relaxed and supercoiled DNA
Topoisomerases can both induce and relax supercoils. Type I topoisomerases introduce transient single-strand breaks in DNA. Type II topoisomerases introduce transient double-strand breaks; one example in bacteria is DNA gyrase.
Figure 18-7A
Figure 18-7B

The Two Strands of a DNA Double Helix Can Be Separated Experimentally by Denaturation and Rejoined by Renaturation
DNA strands are held together by relatively weak noncovalent bonds. Strand separation (denaturation) can be induced experimentally by raising the temperature or pH. The process can be monitored because single- and double-stranded DNA differ in light absorption.

Denaturation
All DNA absorbs ultraviolet light with a maximum around 260 nm. As the strands separate, the absorbance increases rapidly. The temperature at which half of the maximum absorbance change is reached is called the DNA melting temperature (Tm); the Tm varies with base composition.
Figure 18-8
The melting temperature reflects how tightly the DNA double helix is held together. GC base pairs are held by three hydrogen bonds (G≡C) and AT base pairs by two (A=T), so GC pairs are more resistant to separation than AT pairs. The melting temperature therefore increases with the proportion of GC pairs: Tm ∝ GC content.
Figure 18-9

Renaturation
Denatured DNA can be renatured by lowering the temperature to permit hydrogen bonds to re-form. In nucleic acid hybridization, nucleic acids are identified on the basis of sequence: denatured DNA is incubated with a purified single-stranded DNA (a probe) whose sequence is complementary to the sequence one is trying to detect.
Figure 18-10

The Organization of DNA in Genomes
The genome of an organism consists of the DNA that contains one complete copy of all the genetic information of that organism. A haploid set of chromosomes consists of one representative of each type of chromosome, whereas a diploid set contains two of each.

Genome Size Generally Increases with an Organism’s Complexity
Genome size is usually expressed in base pairs (bp). Kilobases (kb, thousands), megabases (Mb, millions), and gigabases (Gb, billions) are used as abbreviations. There is a wide range of genome sizes among organisms, and size generally increases with organism complexity.
Figure 18-11

Restriction Endonucleases Cleave DNA Molecules at Specific Sites
Restriction endonucleases (restriction enzymes) cut DNA molecules at specific internal sites. The specific recognition sequence is called a restriction site, and the resulting DNA pieces are called restriction fragments.
Figure 18B-1

Separation of Restriction Fragments by Gel Electrophoresis
A mixture of restriction fragments can be separated by gel electrophoresis. DNA molecules are negatively charged and so migrate toward the anode (the positive electrode). Small DNA molecules are separated in polyacrylamide gels; larger fragments in agarose gels.
Figure 18-12

The submarine gel
In a gel (either agarose or polyacrylamide), shorter DNA fragments move toward the positive electrode faster than longer ones. After the electric field has been applied for a certain period, fragments of different lengths are separated and can be visualized by autoradiography or by treatment with a fluorescent dye (e.g., ethidium bromide). The relationship between the size of a DNA fragment and the distance it migrates in the gel is logarithmic, so the lengths of DNA fragments can be determined from the band positions.

Detection of DNA in gels
DNA must be stained or labeled before it can be seen. It can be detected with ethidium bromide, which binds DNA and fluoresces orange under UV light. If the DNA fragments have been radioactively labeled, they can be detected by autoradiography, using X-ray film to yield an autoradiogram. Wear gloves when handling the gel.
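
The logarithmic relationship between fragment size and migration distance described above can be used as a calibration: fit log10(length) against distance for a ladder of fragments of known size, then read off the length of an unknown band. The following is a minimal Python sketch, assuming a straight-line fit is adequate over the range of the ladder. The ladder values are made up for illustration, and `fit_ladder` and `estimate_length` are hypothetical helper names, not part of any library.

```python
import math


def fit_ladder(ladder):
    """Least-squares fit of log10(length_bp) = slope * distance_mm + intercept.

    ladder: list of (length_bp, distance_mm) pairs for fragments of known size.
    """
    xs = [dist for _, dist in ladder]
    ys = [math.log10(bp) for bp, _ in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_length(distance_mm, slope, intercept):
    """Convert a migration distance back into an estimated length in bp."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical size ladder (illustrative numbers, not measured data).
LADDER = [(10000, 10.0), (1000, 30.0), (100, 50.0)]

slope, intercept = fit_ladder(LADDER)
unknown_bp = estimate_length(40.0, slope, intercept)
print(f"Estimated length of the band at 40 mm: {unknown_bp:.0f} bp")
```

The slope is negative because longer fragments migrate shorter distances. Real gels deviate from a straight line for very large or very small fragments, so estimates are most reliable for bands that fall between two ladder bands.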
Restriction Mapping
A researcher determines the order of restriction fragments in a DNA molecule by treating the DNA with restriction enzymes and then separating the fragments by gel electrophoresis. The DNA is cut with individual restriction enzymes and with combinations of enzymes to build up the overall restriction map.
Figure 18-13

Rapid Procedures Exist for DNA Sequencing
DNA sequencing means determining the linear order of bases in DNA. Two methods for DNA sequencing were devised at around the same time that restriction digestion was developed: one by Maxam and Gilbert, the other by Sanger and colleagues.

Sequencing methods
The Maxam and Gilbert procedure is a chemical method, based on nonprotein chemicals that cleave DNA preferentially at certain bases. The Sanger procedure, or chain termination method, uses dideoxynucleotides that, when incorporated into a DNA chain, block further synthesis of DNA. Sanger sequencing has been adapted for use in automated machines.
A single-stranded DNA molecule is used as a template to guide synthesis of complementary DNA. DNA synthesis is carried out in the presence of the deoxynucleotides dATP, dCTP, dGTP, and dTTP, which are the normal nucleotides incorporated into growing DNA chains.

Sanger sequencing
Small amounts of the four dideoxynucleotides (ddATP, ddCTP, ddGTP, ddTTP) are added to the synthesis reaction (1). Each of these is labeled with a different fluorescent dye. When a ddNTP is incorporated into a growing DNA chain in place of the normal deoxynucleotide, DNA synthesis halts prematurely, because the absence of a 3′ hydroxyl group makes it impossible to form a bond with the next nucleotide. The terminated strand carries the fluorescent tag of that ddNTP.

Sanger sequencing: the result
Sanger sequencing produces a mixture of DNA strands, each with a fluorescent label that corresponds to the incorporated ddNTP (2). The sample is run on a polyacrylamide gel,
which separates the DNA strands based on size (3).
Figure 18-14

Sanger sequencing (continued)
The fragments run through the gel with the smaller pieces running faster. As they move through the gel, a special camera detects the color of each fragment as it passes (4). In automated machines this information is collected for hundreds of bases in a row and fed into a computer.

The Genomes of Many Organisms Have Been Sequenced
DNA sequencing is now so commonplace and rapid that it is routinely applied to entire genomes, and many organisms have had their genomes sequenced. DNA sequencing machines can only determine the sequence of short pieces of DNA, usually 500 to 800 bases long, one at a time. Computer programs therefore search for overlapping sequences between such fragments, which allows data from hundreds or thousands of DNA pieces to be assembled into longer stretches that can reach millions of bases in length.
Table 18-2

The Human Genome Project
Completed in 2003, the Human Genome Project (HGP) was a 13-year project coordinated by the U.S. Department of Energy and the National Institutes of Health. During the early years of the HGP, the Wellcome Trust (U.K.) became a major partner; additional contributions came from Japan, France, Germany, China, and others. Project goals were to identify all of the approximately 20,000–25,000 genes in human DNA, determine the sequences of the 3 billion chemical base pairs that make up human DNA, store this information in databases, improve tools for data analysis, transfer related technologies to the private sector, and address the ethical, legal, and social issues (ELSI) that may arise from the project.
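
The overlap-based assembly of short sequencing reads described in this section can be illustrated with a toy example. The following is a minimal Python sketch of a greedy strategy, offered as an assumption about how such programs work in principle rather than a description of any real assembler: it repeatedly merges the two reads with the longest suffix-to-prefix overlap. The function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the three example reads are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise, always taking the longest available overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the pieces unjoined
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three made-up overlapping reads (real reads are hundreds of bases long).
example_reads = ["ATGCGT", "GCGTAC", "TACGGA"]
contigs = greedy_assemble(example_reads)
print(contigs)
```

This sketch ignores sequencing errors, reads from the opposite strand, and repeated sequences, all of which make the assembly of real genomes considerably harder.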
The Field of Bioinformatics Has Emerged to Decipher Genomes, Transcriptomes, and Proteomes
Unraveling the sequence of bases was the “easy part.” Now comes the hard part: figuring out the meaning of this sequence of 3 billion A’s, G’s, C’s, and T’s. Determining the DNA sequence of a genome is easier than determining which stretches of DNA correspond to genes, when and in what tissues these genes are expressed, what kinds of proteins they encode, and how the genes and their proteins interact with each other and function. The field of bioinformatics merges computer science with biology to address questions such as these.
Humans may have about twice as many genes as worms or flies.

Junk DNA?
Computer analyses also revealed that only about 1–2% of the human genome actually represents coding sequences.
Although the remaining DNA contains some important regulatory elements, most of it appears to consist of “junk” DNA with no apparent function. This view may not hold, however; see the small RNAs that have recently come into focus.

Transcriptomes
The DNA sequence of an organism’s genome provides only a partial understanding of the functions of its genes. A transcriptome is the entire set of RNA molecules produced by a genome. DNA microarray technology facilitates the study of transcriptomes.

Proteome
Because the function of most genes is to produce proteins, scientists are beginning to look beyond the genome to study the “proteome”: the structure and properties of every protein produced by a genome.

Mass spectrometry
Mass spectrometry is a high-speed, sensitive technique that separates proteins or protein fragments based on differences in mass and charge. Peptides can be produced by digesting proteins with specific proteases and then analyzed by mass spectrometry to identify them. Mass spectrometry has boosted proteome research.

Protein microarray
It is possible to immobilize thousands of different proteins in tiny spots on a small piece of glass. The resulting protein microarrays can be used to study a variety of protein properties, such as the ability to bind other molecules.

Analysis of large amounts of data
The huge amount of new data produced by these techniques must be analyzed by computers. One of the most widely used tools is BLAST (Basic Local Alignment Search Tool), software that searches databases to locate DNA or protein sequences that are similar to other known sequences.

Tiny Differences in Genome Sequence Distinguish People from One Another
On average about 99.7% of the bases in one person’s genome will match the published sequence of the human genome. The remaining 0.3% of bases vary from person to person, creating features that make us unique individuals. Single nucleotide polymorphisms (SNPs) are single-base differences between individuals.

Single nucleotide polymorphisms
Most SNPs are not located in
protein-coding parts of genes. It is not necessary to examine all 20 million SNPs individually because they are not independent of one another. SNPs close to each other on the same chromosome tend to be inherited together in blocks called haplotypes.

SNPs and the HapMap
A database of haplotypes, the HapMap, provides a shortcut for scientists trying to make a connection between a disease and certain genes. Once a trait has been linked to a haplotype, only the SNPs associated with that haplotype need to be studied to determine which one is responsible.

Copy number variants
DNA rearrangements also contribute to genome variability. There are DNA segments thousands of bases long that are present in variable numbers of copies among individuals. Each person’s genome is thought to contain hundreds of these copy number variations (CNVs), which may involve millions of bases overall.

Repeated DNA Sequences Partially Explain the Large Size of Eukaryotic Genomes
In the 1960s Britten and Kohne discovered repeated DNA sequences. They broke DNA into small fragments, denatured them by heating, and allowed them to renature. The rate of renaturation depends on the concentration of each kind of DNA sequence: those found in high concentration reanneal more quickly.

Bacterial vs. mammalian DNA
When mammalian and bacterial DNA were tested, it was expected that bacterial DNA, having fewer types of DNA sequences, would reanneal much faster. The results were not as expected: the calf DNA consisted of two classes of sequences that renature at very different rates. About 40% of the calf DNA renatures more rapidly than bacterial DNA.

Repeated DNA sequences
The more rapidly annealing sequences of the calf DNA contain repeated DNA sequences that are present in multiple copies. Eukaryotes have variable amounts of repeated DNA in their genomes; the rest is nonrepeated DNA. There are two categories of repeated DNA: tandemly repeated DNA and interspersed repeated DNA.
Figure 18-15

Complexity of chromosomal DNA: DNA reassociation (renaturation)
Denatured, single-stranded DNA reassociates into double-stranded DNA in two steps. The first is a slower, rate-limiting, second-order process (rate constant k2) in which complementary sequences find each other and nucleate base pairing. The second is a faster “zippering” reaction that forms long molecules of double-stranded DNA.

Cot
The parameters controlling the reassociation reaction are the DNA concentration (C0) and the time of incubation (t). Cot is the product C0 × t. Cot1/2 is the value of C0 × t required for reassociation to proceed halfway; the greater the Cot1/2, the slower the reaction.
Why does concentration matter? Concentrated DNA reassociates in a short time; dilute DNA takes a long time.
Why does complexity matter? DNA of low complexity reassociates in a short time; DNA of high complexity takes a long time.
DNA complexity ∝ renaturation time, and renaturation time ∝ (DNA concentration)^-1. DNA complexity is therefore a function of DNA concentration (C0) and time (t).

DNA reassociation kinetics (for a single DNA species)
$C_0 t_{1/2} = 1/k_2$, where k2 = second-order rate constant, C0 = DNA concentration, and t1/2 = time for half reaction.
Figure: ideal second-order DNA reassociation curve (Cot curve), plotting % DNA reassociated (0 to 100) against log Cot, with Cot1/2 at 50% reassociation.
Figure: Cot curves for DNAs of increasing complexity. Upper axis: complexity expressed in base pairs (10^0 to 10^10); lower axis: Cot1/2 (10^-6 to 10^4). Curves: 1 = poly(dT)-poly(dA); 2 = purified human satellite DNA; 3 = T4 bacteriophage DNA; 4 = E. coli genomic DNA; 5 = purified human single-copy DNA.
There is a direct relationship between Cot1/2 and complexity.

DNA reassociation kinetics for a mixture of DNA species
Figure: Cot curve for human genomic DNA, plotting % DNA reassociated (0 to 100) against log Cot. Three kinetic fractions are resolved, each with its own Cot1/2: fast (repeated), intermediate (repeated), and slow (single-copy).

Tandemly Repeated DNA
• One major category of DNA repeats is called tandemly repeated DNA
• The multiple copies are arranged next to each other in a row
• It accounts for 10–15% of a typical mammalian genome; a repeat unit can measure anywhere from 1 to 2000 bp, but is usually less than 10 bp
• Tandemly repeated DNA was originally called “satellite DNA” because its distinctive base composition causes it to appear in a satellite band that separates from the rest of the genomic DNA during centrifugation; the separation arises because A-T and G-C base pairs differ slightly in buoyant density

Simple-sequence repeats
Tandem repeats with fewer than 10 bases per repeat make up a subcategory called simple-sequence repeated DNA. There can be as many as several hundred thousand copies at selected sites in the genome. This is the DNA that was originally called satellite DNA.

Functions of simple-sequence repeats
Such sequences are not usually transcribed, so they may be responsible for imposing special physical properties on regions of the chromosome. In eukaryotes, the chromosomal regions called centromeres, which have a role in chromosome segregation, are rich in simple-sequence repeats. Telomeres, at the ends of chromosomes, also contain simple-sequence repeats.
Figure 18-3, Part I

Minisatellites
The amount of satellite DNA at any given site can vary enormously; typically it ranges from 10^5 to 10^7 bp in overall length. The term minisatellite describes shorter regions, between 10^2 and 10^5 bp in length. Microsatellites
are even shorter, 10–100 bp in length, but occur at numerous sites in the genome.

DNA fingerprinting
Microsatellite and minisatellite DNA is useful for DNA fingerprinting, a way to identify individuals. It uses gel electrophoresis to compare DNA fragments derived from different regions of the genome.
Figure 18C-1, Steps 1–3
Figure 18C-1, Steps 4, 5

Triplet repeat diseases
Some diseases, such as Huntington’s disease, are traceable to excessive numbers of triplet repeats (triplet repeat amplification). Other such diseases include fragile X syndrome and myotonic dystrophy.

Interspersed Repeated DNA
Interspersed repeated DNAs are scattered around the genome. A single repeat unit is hundreds or thousands of bases long, and the dispersed copies, which number in the hundreds of thousands, are similar but not identical to one another. They account for 25–50% of mammalian genomes.

Types of interspersed repeated DNA
Most interspersed repeated DNA consists of families of transposable elements (transposons), which can move around the genome and leave copies of themselves behind. Roughly half of the human genome consists of these mobile elements. The most abundant are called LINEs (long interspersed nuclear elements).

LINEs and SINEs
LINEs are 6000–8000 bp long and contain genes required for their own mobilization. SINEs (short interspersed nuclear elements) are less than 500 bp long and rely on enzymes from other elements for their movement. The most common SINEs in humans are Alu sequences, which account for 10% of the human genome.
Figure 18-16

DNA Packaging
Very long molecules of DNA must be fitted into the cell and, in the case of eukaryotes, into the nucleus. DNA packaging is a challenge for all forms of life.

Bacteria Package DNA in Bacterial Chromosomes and Plasmids
Bacterial chromosomes were once thought to be naked DNA. However, it is now known that the DNA is packaged somewhat similarly to the chromosomes of eukaryotes. The main bacterial genome is called the bacterial chromosome.
Bacterial Chromosomes
Bacteria may have single or multiple, linear or circular chromosomes, depending on the species, but a single circular chromosome is most common. The DNA molecule is bound to small amounts of protein and localized to a region of the bacterial cell called the nucleoid. The bacterial DNA is negatively supercoiled and folded into loops.
Figure 18-17
The loops of bacterial DNA are held in place by RNA and basic protein molecules. Treatment with ribonuclease degrades the RNA and releases some of the loops. Nicking with a topoisomerase relaxes the supercoils but does not affect the loops.

Bacterial plasmids
Besides the chromosome, bacteria may contain one or more plasmids: small, usually circular DNA molecules that contain genes for their own replication. They may also carry genes for cellular functions. Most are supercoiled, and although they replicate autonomously, their replication is somewhat synchronous with that of the chromosome.

Types of plasmids
F (fertility) factors are involved in the process of conjugation.
R (resistance) factors carry genes that impart drug resistance to the bacterium.
Col (colicinogenic) factors allow bacteria to secrete colicins, compounds that kill nearby bacteria that lack the col factor.
Virulence factors enhance the ability to cause disease by producing toxic proteins.
Metabolic plasmids produce enzymes required for certain metabolic reactions.
Cryptic plasmids have no known function.

Eukaryotes Package DNA in Chromatin and Chromosomes
In eukaryotes, more DNA is involved per cell and it interacts with more proteins. DNA bound to proteins forms chromatin. At the time of division, the chromatin fibers condense into a more compact structure, the chromosome.

Histones
Histones are a group of small basic proteins with a high lysine and arginine content. The negatively charged DNA binds stably to these positively charged proteins. The mass of histones in a chromosome is approximately equal to the mass of the
DNA.

Types of histones
There are five main types of histones: H1, H2A, H2B, H3, and H4. Chromatin contains about equal numbers of all of these except H1, which is present in about half the amount of the others. Chromatin also contains a number of nonhistone proteins, which play a variety of roles.

Nucleosomes Are the Basic Unit of Chromatin Structure
In the 1960s X-ray diffraction studies revealed that chromatin has a repeating structural subunit seen in neither DNA nor histones alone. When isolated from cells, chromatin fibers appear as a series of tiny particles attached by thin filaments (“beads on a string”). The “beads” are called nucleosomes.
Figure 18-18

Evidence for nucleosomes
Chromatin can be exposed to a nuclease that cleaves DNA, and the partially degraded DNA can then be separated from the proteins. Electrophoresis shows a distinctive pattern of DNA bands at repeating 200-bp intervals. This pattern is not generated when DNA alone is digested, which suggests that nucleosomes occur at 200-bp intervals.
Figure 18-19

Further evidence for nucleosomes
Chromatin can be digested briefly with micrococcal nuclease. The fragmented chromatin is then separated into fractions by centrifugation. The smallest fraction contains a single spherical particle, the next fraction contains two particles, and so on. The particles are nucleosomes.
Figure 18-20, Two upper panels
Figure 18-20, Two lower panels

A Histone Octamer Forms the Nucleosome Core
Kornberg and colleagues showed that nucleosomes can be assembled in vitro only when the histones used have been isolated gently. Under these isolation procedures the histone complexes H2A–H2B and H3–H4 remained intact. They concluded that the H2A–H2B and H3–H4 complexes were an integral part of the nucleosome.

More investigation of the nucleosome
Kornberg and colleagues used chemical crosslinking to show that nucleosomes contain an octamer of eight histones. Histone octamers contain two H2A–H2B dimers and two H3–H4 dimers, with the DNA wrapped around the octamer. The octamer
with 146 bp of DNA is the core particle; the extra DNA from the original 200 bp is called linker DNA.

More investigation of the nucleosome
The amount of linker DNA varies among organisms, but the DNA associated with the core particle always measures close to 146 bp. This is enough DNA to wrap about 1.7 times around the histone core.
Figure 18-21
Histone H1 might be associated with the linker region.

Nucleosomes Are Packed Together to Form Chromatin Fibers and Chromosomes
Nucleosome formation is the first step in the packaging of nuclear DNA. Isolated chromatin (“beads on a string”) measures about 10 nm in diameter, but the chromatin of intact cells measures about 30 nm (the 30-nm chromatin fiber). Histone H1 facilitates formation of the 30-nm fiber.
Figure 18-22A, B

Further packing of chromatin
The 30-nm fiber seems to be packed together in an irregular, three-dimensional zigzag structure. These fibers fold into looped domains 50,000–100,000 bp in length, attached periodically to the chromosomal scaffold. Chromatin that is so highly compacted that it shows up as dark spots on micrographs is called heterochromatin; the more diffuse chromatin is called euchromatin.
Figure 18-22B, C
Figure 18-23

Heterochromatin and euchromatin
The tightly packed heterochromatin contains DNA that is transcriptionally inactive. The more diffuse, loosely packed euchromatin is associated with DNA that is actively transcribed. As a cell prepares to divide, all of its chromatin becomes highly compacted. Because the chromosomal DNA has recently been duplicated, each chromosome is composed of two chromatids.
Figure 18-22D, E

Packing ratio
The packing ratio is the extent to which a DNA molecule has been folded (total length of DNA molecule / length of chromatin fiber or chromosome). It is calculated by measuring the entire length of the DNA molecule and dividing by the length of the fiber or chromosome into which it has been packaged. The packing ratio of DNA coiled around nucleosomes is around 7, and packing DNA into the 30-nm fiber
results in a further reduction in length of about six-fold. Overall, the packing ratio of the 30-nm fiber is therefore about 42 (7 × 6).

DNA packaging ratio = extended length of DNA molecule / length of chromatin fiber
Packing ratios listed for successive levels of packaging: 1 (extended DNA), 7 (nucleosomes), 42 (30-nm fiber), 750, and 15,000 to 20,000 (in chromosomes of dividing cells).

Eukaryotes Package Some of Their DNA in Mitochondria and Chloroplasts
Mitochondria and chloroplasts have their own chromosomes, which are devoid of histones and are usually circular. Though both organelles can encode some of their own polypeptides, they depend on the nuclear genome to encode most of them.
Figure 18-24

The human mitochondrion
The genome of the human mitochondrion has been sequenced. It is 16,569 base pairs long and contains 37 genes, which encode about 5% of all the RNAs and proteins needed by the mitochondrion.
Figure 18-25

Mitochondrial genomes
The size of mitochondrial genomes varies considerably among organisms. Yeasts have mitochondrial genomes about five times larger than those of mammals, but most of the extra DNA is noncoding. A 648-nucleotide mitochondrial sequence, sometimes called the DNA bar code, can be used to distinguish closely related species.

Chloroplast genomes
Chloroplasts usually possess circular DNA molecules of about 120,000 bp in length, containing around 120 genes. Subunits of some multimeric protein complexes are encoded by the nuclear genome; this is true for both chloroplasts and mitochondria.

The Nucleus
The nucleus is the site within the eukaryotic cell where the chromosomes are localized and replicated and where the DNA they contain is transcribed. It is one of the most prominent and distinguishing features of eukaryotic cells.
Figure 18-26
Figure 18-27

A Double-Membrane Nuclear Envelope Surrounds the Nucleus
The nucleus is bounded by a nuclear envelope with an inner and an outer membrane separated by a perinuclear space. The outer membrane is continuous with the ER and contains proteins that bind actin and the intermediate filaments (IFs) of the cytoskeleton. Tubular invaginations of the envelope, the nucleoplasmic reticulum, project into the nucleus.

Nuclear
pores
Nuclear pores are specialized channels in the nuclear envelope, where the inner and outer membranes are fused. They provide direct contact between the cytosol and the nucleoplasm (the interior nuclear space). They are lined with a protein structure called the nuclear pore complex (NPC).
Figure 18-28

Nuclear pore complex
The NPC is built from about 30 different proteins called nucleoporins. The complex has a striking symmetry. The “central granule” is called the transporter and is likely involved in moving molecules across the nuclear envelope.
Figure 18-29A
Figure 18-29B

Molecules Enter and Exit the Nucleus Through Nuclear Pores
Enzymes and other proteins needed in the nucleus must be imported from the cytoplasm. RNAs that need to be translated and components of ribosomes must be exported from the nucleus. In addition to all this traffic, the pores also mediate the passage of small particles, molecules, and ions.
Figure 18-30

Simple Diffusion of Small Molecules Through Nuclear Pores
Small particles, less than 10 nm in diameter, pass through the pores at a rate inversely related to the size of the particle. The NPC contains tiny aqueous diffusion channels through which small particles move freely.

Active Transport of Large Proteins and RNA Through Nuclear Pores
Some proteins needed in the nucleus are too large to diffuse easily through the nuclear pores. These large particles are actively transported across the nuclear envelope. Nuclear localization signals (NLS) enable a protein to be recognized and transported by the nuclear pore complex.

The Import Process
A cytoplasmic protein with an NLS is recognized by a receptor protein called an importin, which binds the NLS and mediates movement of the protein to a nuclear pore (1). The importin-protein complex is transported into the nucleus by the transporter at the center of the NPC (2). Inside the nucleus, the importin associates with the GTP-bound form of a GTP-binding protein called Ran (Ran-GTP), causing the importin to release the NLS-containing protein (3). The Ran-GTP-importin
complex is transported back to the cytoplasm through the NPC (4). In the cytoplasm, the importin is released as the GTP is hydrolyzed (5).

The export process
Export occurs by a comparable process; RNA export is mediated by adaptor proteins that bind RNA. The adaptor proteins contain nuclear export signals (NES); NES sequences are recognized by exportins, which mediate transport of the complexes out of the nucleus.

Maintaining a Ran-GTP gradient across the membrane
Ran-GTP is maintained at high levels inside the nucleus by a guanine-nucleotide exchange factor (GEF) that promotes the binding of GTP to Ran. The cytosol contains a GTPase-activating protein (GAP) that promotes hydrolysis of GTP by Ran.

Function of the Ran-GTP gradient across the membrane
The high nuclear Ran-GTP concentration promotes the release of NLS-containing cargo from importin. It also promotes the binding of NES-containing cargo to exportin. Nuclear transport factor 2 (NTF2) shuttles Ran-GDP back into the nucleus.
(Analogy: the wife needs to come back home again; to pay the taxi fare, you have money only at home.)

The Nuclear Matrix and Nuclear Lamina Are Supporting Structures of the Nucleus
The nuclear matrix (nucleoskeleton) is an insoluble fibrous network that helps maintain the shape of the nucleus. The nuclear lamina is a thin, dense meshwork of fibers lining the inner surface of the inner nuclear membrane. It consists of intermediate filaments built from proteins called lamins.
Figure 18-32A
Figure 18-32B

Chromatin Fibers Are Dispersed Within the Nucleus in a Nonrandom Fashion
Most of the time, a cell’s chromatin fibers are extended and dispersed through the nucleus. The chromatin of each chromosome has its own discrete location (chromosome territory). In situ hybridization with nucleic acid probes that recognize sequences unique to individual chromosomes demonstrates this.
Figure 18-33

Chromatin organization
Parts of the chromatin are bound to the nuclear envelope near the nuclear pores. These regions are highly compacted, and most of this chromatin is constitutive
heterochromatin (highly condensed at all times; e.g., centromeres and telomeres). Constitutive heterochromatin is composed largely of simple-sequence repeats.

Facultative heterochromatin
Facultative heterochromatin varies with the activities of the cell, and so can differ from tissue to tissue. It can even vary over time within one cell.

The Nucleolus Is Involved in Ribosome Formation
The nucleolus is the place in the nucleus where ribosomal subunits are assembled. Fibrils in the nucleolus contain DNA that is being transcribed into ribosomal RNA (rRNA). Granules in the nucleolus are rRNA molecules being packaged with proteins.
Figure 18-34
Figure 18-35

The NOR and nuclear bodies
The nucleolus organizer region (NOR) is a stretch of DNA containing multiple (hundreds to thousands) copies of rRNA genes. Additional bodies in the nucleus play roles in the processing and handling of RNA molecules. These include Cajal bodies, Gemini of Cajal bodies, speckles, and PML bodies.
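
Worked example: packing ratio
The packing-ratio arithmetic from the chromatin section above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides. The function names are hypothetical, and the value of 0.34 nm per base pair (the approximate rise of B-form DNA) is an assumption that does not appear in the text. The ratios 7 and six-fold, and the 100,000 bp looped domain, are the figures quoted above.

```python
# Assumption (not from the slides): B-form DNA rises about 0.34 nm per base pair.
NM_PER_BP = 0.34

def extended_length_nm(n_bp: int) -> float:
    """Length of a fully extended DNA molecule of n_bp base pairs, in nm."""
    return n_bp * NM_PER_BP

def packing_ratio(dna_length_nm: float, packaged_length_nm: float) -> float:
    """Extended DNA length divided by the length of the fiber or chromosome."""
    if packaged_length_nm <= 0:
        raise ValueError("packaged length must be positive")
    return dna_length_nm / packaged_length_nm

def packaged_length_nm(n_bp: int, ratio: float) -> float:
    """Length of the packaged structure implied by a given packing ratio."""
    return extended_length_nm(n_bp) / ratio

# Figures quoted in the text.
NUCLEOSOME_RATIO = 7   # DNA coiled around nucleosomes
FIBER_FOLD = 6         # further compaction into the 30-nm fiber

# Successive packing levels multiply: 7 x 6 = 42.
fiber_ratio = NUCLEOSOME_RATIO * FIBER_FOLD

# A 100,000 bp looped domain is about 34,000 nm when fully extended;
# as a 30-nm fiber it would be about 810 nm long.
loop_bp = 100_000
print(extended_length_nm(loop_bp))
print(packaged_length_nm(loop_bp, fiber_ratio))
```

The point of the sketch is that packing ratios at successive levels multiply, which is why a ratio of 7 followed by a six-fold compaction gives 42 for the 30-nm fiber.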