Nucleic Acids, Chromatin and Chromosomes

Types Of DNA
Chromosomal DNA in the nucleus
Mitochondrial DNA: we have it
Chloroplast DNA: plants have it
Bacterial DNA (chromosomes and plasmids)
Viral DNA

Types Of RNA
Messenger RNA: translated into polypeptides
Ribosomal RNA: integral part of the ribosome
Transfer RNA: carries amino acids in for polypeptide synthesis
Small nuclear RNAs: help process pre-mRNAs into mRNAs
Small nucleolar RNAs: help process and assemble ribosomal RNAs
Ribozymes: can function like enzyme proteins
  o Ex. cleave RNAs, assemble polymers (rRNA)
Antisense RNAs
  o Some interfere with protein production, ex. microRNAs (miRNAs), small interfering RNAs (siRNAs), interfering RNAs
  o Some enable protein production
  o Piwi-interacting RNAs: suppress transcription of transposons in testes; interact with the Piwi protein (P-element Induced WImpy testis in Drosophila)
  o CRISPR RNAs: help bacteria and archaea destroy invading viruses and plasmids
Long noncoding RNAs
  o Some act as molecular decoys, binding up proteins that would otherwise destabilize chromosomes
  o Others cause changes in the structure of chromosomes, inhibiting gene expression

DNA Is The Molecule Of Inheritance
What are the requirements for the molecule of inheritance?
What molecules were possible candidates?
What experiments proved that it was DNA?
The Molecule Of Inheritance Must
Contain the information that determines all the traits and functions of the organism
Be stable, but also able to change in some ways without causing harm to the organism
Be carried on chromosomes
  o The chromosomal theory of inheritance (put forth by Walter Sutton in 1902, and confirmed by independent work by Theodor Boveri around the same time) was widely accepted by then
  o The theory postulated that organisms possess matched pairs of maternal and paternal chromosomes that separate from each other during meiosis and "may constitute the physical basis of the Mendelian law of heredity" (Sutton 1902)

Protein Was The Other Possible Candidate
DNA, RNA and protein were the three logical candidates
  o But only DNA and protein are contained in chromosomes
Protein was a strong candidate on the basis of the need for variability
  o It was known that 20 amino acids are used to make proteins, but only 4 nucleotides are used to make DNA or RNA, so it was assumed that far more different types of proteins than DNA molecules could be made

Griffith Taught Us That One Bacterium Can Transmit Its Characteristics To Another
Frederick Griffith (published in 1928) used two strains of Diplococcus pneumoniae, which causes lethal pneumonia in mice
  o The virulent S strain (wild type, i.e.
most common form in nature)
  o Its capsule defends the bacteria from the mouse's immune system (and makes them shiny and slimy)
  o Can be killed (and therefore rendered nonvirulent) by heating
The nonvirulent R strain (mutant strain)
  o Makes no capsule
  o Killed by the mouse's immune system
Mice injected with the S strain died, and living S-type bacteria were recovered from the bodies
  o Mice injected with the R strain survived
  o Mice injected with heat-killed S strain survived
  o Mice injected with a mixture of R strain and heat-killed S strain died, and living S-type bacteria were recovered from the bodies
Griffith demonstrated that there was a "transforming principle" that could allow one bacterium to acquire the characteristics of another
The R strain probably only needed to acquire a good copy of one gene from the S strain, to compensate for its mutation and enable it to make the capsule

Avery, MacLeod and McCarty Showed That DNA Was The Transforming Principle
Griffith's lab (in London) was bombed in 1941, during World War II
Oswald Avery, Colin MacLeod and Maclyn McCarty continued Griffith's work in America
Avery, MacLeod and McCarty (published in 1944) showed that DNA, not RNA or protein, was the transforming principle
They performed experiments similar to Griffith's, but treated independent samples of the heat-killed S bacteria in three ways
  o With DNase to destroy the DNA
  o With RNase to destroy the RNA
  o With proteases to destroy the proteins
When the DNase-treated S bacteria were mixed with the R bacteria, the mice that were injected with the mixture survived (the RNase-treated and protease-treated samples still transformed the R bacteria)
Transformation did not occur after the DNA was destroyed, so DNA is the transforming principle

Avery, MacLeod and McCarty Also Showed The Transforming Principle Was Heritable
The virulent S bacteria that were recovered from the dead mice were allowed to grow in culture and reproduce
  o The offspring bacteria were all virulent and had capsules

Hershey And Chase Showed That A Bacteriophage Injects Its DNA,
Not Protein, Into Its Target Bacteria
Alfred Hershey and Martha Chase (published in 1952) used the T2 bacteriophage, which is composed of a DNA core surrounded by a protein coat
They grew two sets of E. coli bacteria
  o One in medium with radioactive sulfur, which made the bacterial proteins radioactive (through the sulfur-containing amino acids methionine and cysteine)
  o One in medium with radioactive phosphorus, which made all the nucleotides in the bacterial nucleic acids radioactive
Hershey and Chase allowed separate sets of T2 bacteriophages to infect the two types of E. coli bacteria
  o The T2 phages infected the bacteria and made new T2s, using the bacterial amino acids to make the new T2 proteins and the bacterial nucleotides to make the new T2 DNA
  o The progeny phages from bacteria that had been grown in the radioactive sulfur medium had radioactive proteins
  o The progeny phages from bacteria that had been grown in the radioactive phosphorus medium had radioactive DNA
Hershey and Chase allowed the two sets of progeny T2 bacteriophages to infect separate new batches of E. coli bacteria that had not been grown in any special medium
The protein coat of the T2 bacteriophage stays outside, on the surface of the bacterium, after the phage infects the bacterium
  o After the phages infected the bacteria, Hershey and Chase sheared the protein coats off the bacteria and centrifuged the mixture, so that the infected bacterial cells formed a pellet at the bottom and the protein coats stayed in the supernatant liquid
Where the E. coli had been infected by progeny phages with radioactive sulfur, the radioactivity was found in the supernatant liquid, and not in the infected bacterial cells
Where the E.
coli had been infected by progeny phages with radioactive phosphorus, the radioactivity was found in the infected bacterial cells, and not in the supernatant liquid
The progeny phages had injected their DNA, not protein, when they infected the E. coli

Hershey & Chase Experiments
[Figure: viral material is injected into bacteria; viral protein coats are knocked off the bacteria; infected bacteria form a pellet; radioactive DNA is found in the bacterial cells and radioactive viral protein coats in the supernatant]
*Supernatant: the soluble liquid fraction of a sample after centrifugation or precipitation of insoluble solids.

Several Lines Of Research Converged On The Concept Of A Double Helix
Edwin Chargaff demonstrated that the number of A bases = T bases and C bases = G bases, setting the stage for the concept of complementary base pairs
  o This is referred to as Chargaff's rule
Rosalind Franklin and Maurice Wilkins studied X-ray diffraction of DNA and discovered that
  o DNA is helical and made of 2 parallel parts
  o The helix contains 10 nucleotides (nt) per turn
  o The helix has a diameter of 20 angstroms (2 nm)
Watson and Crick (published in 1953) proposed that the two DNA strands lie in an antiparallel arrangement: they lie parallel to each other, but run in opposite orientations (like the lanes of a street)
  o They proposed that bonds between the A-T bases and C-G bases hold the two strands of the double helix together
  o They also noted that the concept of complementary bases immediately suggested a mechanism for DNA replication

Structure Of Nucleotides
Sugar + nitrogenous base = nucleoside
Nucleoside + phosphate group = nucleotide

The Bases In DNA And RNA
DNA contains adenine, cytosine, guanine, thymine (ACGT)
RNA contains adenine, cytosine, guanine, uracil (ACGU)

A Nucleic Acid Strand Has A 5' End And A 3' End
A strand of DNA or RNA is said to have a 5' end and a 3' end
When a chain of nucleic acid is extended (ex.
when a gene is used to make RNA or when DNA is replicated), a new nucleotide is added by hooking the phosphate group on the 5' carbon of the new nucleotide onto the O that is on the 3' carbon of the previous nucleotide
  o Therefore, the chain of nucleotides has a 5' phosphate at the "front" end, i.e. the first nucleotide that was put in the chain: the 5' end
  o It also has an OH on the "tail" end, the final nucleotide to have been added: the 3' end
RNAs are single stranded
Note the antiparallel arrangement of the two DNA strands
Note that there are 2 H bonds in an A-T pair and 3 H bonds in a C-G pair

The Double Helix
Sugar-phosphate backbones of each strand are on the outside of the helix
  o Covalent bonds between sugar and phosphate hold nucleotides together in a single chain
  o Covalent bonds are stronger than hydrogen bonds, so opening up the double helix breaks the hydrogen bonds between the bases and leaves the backbones intact
Bases (A, G, C, T) are stacked inside
  o Hydrogen bonding (non-covalent) between complementary bases keeps the 2 DNA strands together
Major and minor grooves are formed
  o Important for protein binding

RNA Structure
RNA is single-stranded, but can fold on itself and adopt a number of conformations
  o If two stretches of bases have complementary sequences, they will bind to each other (the complementarity does not have to be perfect)
  o Stem-and-loop hairpins
  o Right-handed double helices
  o Internal loops and bulges

Transfer RNA Structure
tRNAs often contain modified nucleotides
  o Ex. methylguanosine
tRNAs are characterized by an anticodon, which binds the mRNA at the ribosome, and an amino acid-carrying portion

Ribosomal RNA Structure
More complex than tRNAs, but with similar structural elements

Nucleoside Analogs Can Be Used To Battle Cancer
In cancer, a set of cells is replicating and dividing at a fast rate
  o Some viruses (ex.
HIV) inject RNA, and then use reverse transcriptase to make a DNA copy of the RNA
  o These processes require the cells to synthesize a lot of DNA
Anti-HIV therapy uses nucleoside analogs such as 3'-azidodeoxythymidine (AZT) and 2',3'-dideoxyinosine (DDI)
  o Their 5' ends look enough like real nucleotides that the replicating cells incorporate them into the newly synthesized DNA, but they have no 3' OH, so the cell cannot extend the DNA strand any farther
The 5' OH group gets phosphorylated, thereby making it look like a regular nucleotide's 5' end
  o Hydroxyl groups get phosphorylated easily
The 3' C does not have an OH to attach another nucleotide to. Once the nucleoside analog is incorporated, the DNA chain cannot be extended, and because the cell cannot replicate its DNA, it dies
  o Cancer involves uncontrolled and rapid growth, so shutting down replication can stop the growth of the cancer

Mitochondrial And Chloroplast DNA Is Circular
Mitochondrial DNA has roughly 3 dozen genes
If a mutation occurs, it can produce several very different diseases
Makes some of the proteins that are involved in electron transport (E.T.)
Mitochondrial and chloroplast genes encode rRNAs, tRNAs and proteins that are essential for the function of the organelle
Mitochondria need a number of proteins that are made by nuclear genes in order to function
Over evolution, genes appear to have transferred from the mitochondrion to the nucleus, and from chloroplasts to mitochondria

A Few Definitions
Dinucleotide = two nucleotides
Trinucleotide = three nucleotides (etc.)
Oligonucleotide = fewer than 40 nucleotides (approximate)
Denaturing = separating the two DNA strands, often using heat to break the H bonds between bases (often called melting the strands)
  o A sequence with high G-C content has a higher melting temperature than one with high A-T content, because each G-C pair has 3 H bonds and each A-T pair has only 2
Annealing = complementary single strands of DNA H-bond back together

Chromatin And Chromosomes

Bacterial (Chromosomal) DNA Is Attached To
Proteins
Instead of a nucleus, a bacterium has a nucleoid region
The DNA forms loop domains, each of which is anchored by DNA-binding proteins
  o The DNA in each loop domain is supercoiled

Bacterial DNA Is Supercoiled
Topoisomerases catalyze changes in chromatin configuration
  o In most bacteria, topoisomerases negatively supercoil the DNA, and nucleoid-associated proteins (NAPs) hold it in the proper configuration

Bacterial DNA Can Be Supercoiled In Either Direction
Most bacteria have a circular DNA molecule, which may be in the relaxed or supercoiled configuration
Bacterial chromosomes can be positively (overcoiled) or negatively (undercoiled) supercoiled

Bacteria May Also Contain Plasmids
A plasmid is a small, circular piece of DNA that replicates independently of the bacterial chromosome
Plasmids can be replicated, and the copies transferred to another bacterium, by a process known as conjugation
Plasmids do not contain genes that are essential for life, but they may contain genes that encode proteins that help the bacteria survive and propagate
  o Ex.
Resistance factors = proteins that protect them against antibiotics
  o Fertility factors = proteins that enable them to engage in conjugation with other bacteria more effectively

Plasmids Are Copied And Transferred During Bacterial Conjugation

DNA Is Supercoiled So It Can Fit Inside A Cell's Nucleus
Enzymes called topoisomerases temporarily cut the DNA strands and rotate the ends of the DNA so it can coil
The DNA is wound around a cluster of eight histone proteins to make a nucleosome
  o Histone H1 locks the DNA around the histones
  o Nucleosome plus H1 = chromatosome
A stretch of linker DNA connects the chromatosomes
The string of chromatosomes is coiled many times
Scaffold proteins help hold the coils together
The combination of DNA and proteins is called chromatin

Eukaryotic DNA Is Wound Around Histone Proteins
If stretched out in a straight line, the DNA in one human cell is approximately 6 feet long
  o It takes approximately 6500 nuclei, side by side, to span one inch

The Interaction Between DNA And Histone Proteins Helps Regulate Gene Activity
In order for DNA to replicate or a gene to make its mRNA, the histones must loosen their grip on the DNA
The best-understood mechanism involves acetylation of lysines in the histone proteins
  o DNA has a negative charge; the lysines in histone proteins have positive charges
  o Histone acetyltransferase puts acetyl groups on the histone proteins (histone deacetylase takes them off)
  o Putting an acetyl group on a lysine neutralizes its positive charge, which causes the histones to release the DNA a little and allows it to open up so the proteins that are needed for transcription or replication can access the DNA

Coiling The DNA Aligns Regulatory Sequences
For some genes, there may be several regulatory sequences, and they may lie far from the promoter

Chromatin Relaxes To Allow Gene Expression
In order for a gene to make its mRNA, the histones must loosen their grip on the DNA

Most Eukaryotic Chromosomes Are Linear

Eukaryotes Have Metacentric,
Submetacentric, Acrocentric And Telocentric Chromosomes

Some Eukaryotes Have Polycentric Chromosomes
Some organisms' chromosomes have centromeres distributed along the chromosome; spindle fibers attach all along the chromosome
  o Ex. Caenorhabditis elegans, a well-known nematode (roundworm), and the lesser woodrush

The Centromere Has A Characteristic Repeated Sequence
The centromere has a lot of repeated sequences
Different organisms have different specific centromeric sequences
  o Repeated sequences at human centromeres span hundreds of thousands of bp
Centromeres contain a variant form of histone H3 and adopt a characteristic chromatin configuration; this is what allows them to serve their functions

Spindle Fibers Grab The Chromosomes' Centromeres
The centromere has a specialized protein structure around it called the kinetochore
The spindle fibers bind to the kinetochores, so the two members of each homologous chromosome pair can be pulled apart during anaphase I of meiosis, and the two sister chromatids can be pulled apart during anaphase II of meiosis or anaphase of mitosis
[Figure: anaphase; spindle fibers (green) draw the chromosomes (blue) apart]

Telomeres Must Be Stabilized
The human telomere repeat sequence is 5'-TTAGGG-3' / 3'-AATCCC-5' (telomeric end)
One strand projects beyond the other strand
  o The Protection Of Telomeres protein (POT1) binds to the single-stranded DNA
The telomeric end of the long strand binds to itself at a complementary sequence to form a t-loop
A multi-protein complex called shelterin binds the telomeres and keeps them from being replicated (in most organisms)

Euchromatin And Heterochromatin
Each chromosome has regions in which the chromatin is more condensed than it is in other regions
Euchromatin is less condensed (although still supercoiled), and most of the genes that are located in euchromatic regions are active
Heterochromatin is even more condensed than euchromatin, and the genes that lie in heterochromatic
regions are usually inactive
Chromosomes can break and rearrange themselves
If a chromosome rearrangement moves a gene too close to a heterochromatic region, this may silence the gene (position effect)

Gene Regulation In Prokaryotes

Negative Control Of The lac Operon: The Controller Of Inducibility
The lac operon is inducible: the cell does not need the lac operon to be active unless there is lactose present in the environment
The lacI gene produces a repressor protein (in an active form) and is constitutive, always producing the repressor
  o When lactose is present, the lactose (which helps the cell produce energy) is converted to allolactose
  o Allolactose binds to the repressor and prevents it from binding to the operator
  o Transcription occurs, the proteins are made, and lactose gets catabolized (inducible)
*Catabolic operons are inducible: they do not need to be turned on until the thing that needs to be metabolized is present
*There are also anabolic operons, which are repressible: they are wanted on all the time (making a critical nutrient), and it makes sense to turn them off when the nutrient is not needed so the cell does not overproduce it
  o When the level of lactose drops, the repressor is able to repress transcription again

Mutations In The lac Operon Revealed The Means By Which The Operon Works
Jacob and Monod discovered the mechanism for gene regulation in the lac operon using E. coli strains that had mutations in different portions of the lac operon
  o They used partial diploid cells (aka merozygotes): cells that had taken up a plasmid that contained its own copy of the lac operon, and therefore had two copies of the lac operon, one in the regular genomic DNA and one in the plasmid
  o Genotypes are written as, ex.
I+ P+ O+ Z+ Y- / I- P+ O+ Z+ Y (regular chromosome / plasmid)
  o + = wild-type sequence, functional; - = mutant sequence, not functional
  o P = promoter, O = operator, I = regulatory gene, Z and Y = structural genes
  o If a partial genotype is given, all unspecified elements are functional (+)
*Anything you don't see is functionally good

The Z And Y Genes (And Presumably The A Gene) Are (Mostly) Independent Of Each Other
Jacob and Monod discovered that the lacZ and lacY genes work independently of each other
A bacterium with the lacZ+ lacY- / lacZ- lacY+ genotype could make both beta-galactosidase and permease
A mutation in one copy of the beta-galactosidase or permease gene did not affect protein production from the other copy of that gene (the mutation is cis-acting)
The cell had one working copy of each gene, and could metabolize lactose
The Z, Y and A genes are not completely independent of each other
  o If something stops translation while the ribosome is translating the Z portion of the mRNA, the Y and A portions will not get translated
  o Ribosomes get knocked off
  o A nonsense mutation in one of the operon's genes can cause there to be no translation of any of the downstream genes, so the genes are not completely independent
  o With a missense mutation, the ribosome continues to translate the mRNA and produce protein downstream

Mutations In The lac Operon Revealed The Means By Which The Repressor Works
Jacob and Monod found one strain of E. coli that had a mutation in the lacI gene that caused the repressor protein to be inactive, so transcription was always on (lacI- genotype)
A bacterium with the lacI+ lacZ- / lacI- lacZ+ genotype
  o Did not produce beta-galactosidase in the absence of lactose, which means the repressor protein from the main chromosome's I gene could diffuse to bind to the other copy of the operon in the plasmid (i.e.
the repressor is a trans-acting factor)
  o Produced beta-galactosidase only when lactose was present, because the functional repressor molecules that are made by the main chromosome's I gene must be inactivated by allolactose in order for there to be transcription
There are superrepressor mutations that prevent the inducer (allolactose) from binding to the repressor, leaving the repressor always active and the inducer unable to induce transcription (genotype = lacIs)
A bacterium with the lacIs lacZ+ / lacI+ lacZ+ genotype, or even the lacIs lacZ+ / lacI- lacZ+ genotype, does not produce beta-galactosidase, even when lactose is present
  o The superrepressor protein is a trans-acting factor, and will repress both copies of the operon, even in the presence of lactose
*Because the repressor is a diffusible molecule, even one I+ copy is enough: the repressor can be made from the main chromosome and diffuse to the plasmid, so one working regulatory gene can repress in both places

Mutations In The lac Operon Revealed The Means By Which The Operator Works
Jacob and Monod found a strain of E.
coli that had a constitutive mutation in the operator (lacOc genotype) that prevented the repressor from binding, so transcription was always on
  o Transcription keeps occurring; the operon is always on
  o The cell is continuously making Z and Y
A bacterium with the lacOc lacZ+ / lacO+ lacZ+ genotype produces beta-galactosidase all the time, even when lactose is absent, because the repressor cannot bind the Oc operator
A bacterium with the lacI+ lacO+ lacZ- / lacI+ lacOc lacZ+ genotype produces beta-galactosidase all the time, even when lactose is absent, because the plasmid's Oc mutation drives transcription of the plasmid's Z+ gene
A bacterium with the lacI+ lacO+ lacZ+ / lacI+ lacOc lacZ- genotype produces beta-galactosidase only when lactose is present, because the plasmid's Z- gene will not make functional beta-galactosidase under any conditions; this nullifies the Oc mutation's effect on beta-galactosidase production
The constitutive operator mutation can only cause beta-galactosidase to be produced constitutively if the lacOc and lacZ+ sequences lie together on the same DNA molecule: the operator is a cis-acting element
*If you have both Oc and the superrepressor, Oc wins: the repressor can't bind to the operator to begin with, so it doesn't matter that lactose can't take the repressor off the operator; you still get constitutive transcription

Mutations In The lac Operon Revealed The Means By Which The Promoter Works
Promoter mutations that prevent RNA polymerase from binding to the promoter (lacP- genotype) demonstrated that the promoter is a cis-acting element
A bacterium with the lacI+ lacP+ lacZ+ / lacI+ lacP- lacZ+ genotype produces beta-galactosidase normally (from the main chromosome) when lactose is present
The promoter mutation in the plasmid does not prevent transcription in the main chromosome: the promoter is a cis-acting element

The lac Operon Also Exhibits Positive Control: Catabolite Repression
When glucose is present, the bacterium will prefer to use it for energy instead of lactose, and will shut down the lac operon
This is a form of positive control
There is an activator binding site a little way upstream from the lac operon
  o The cyclic AMP receptor protein, or CRP, is the activator, but it needs to be bound to cAMP in order to be active
  o When glucose is present, cAMP concentrations are low (glucose inhibits adenylate cyclase, which makes cAMP), there is little CRP-cAMP complex to bind to the activator binding site, and transcription stops
  o As glucose drops, cAMP rises, allowing CRP and cAMP to bind, resulting in activation of the lac operon

The trp Operon: A Negative Repressible Operon
The trp operon contains five structural genes that work together to synthesize the amino acid tryptophan
The cell needs the trp operon to be active, except when there is ample tryptophan present
The regulatory gene makes a repressor protein, which is made in an inactive form (so the operon is usually on, i.e. active)
Tryptophan is the corepressor
When tryptophan is present, it binds the repressor and allows the repressor to bind to the operator; transcription stops until the level of tryptophan decreases

Multiple Operons Can Be Controlled By One Activator/Repressor
These activator and repressor proteins diffuse throughout the nucleoid region, and can contact all the bacterium's genes (i.e. they are trans-acting factors)
One activator or repressor can therefore control the activity of a large number of genes scattered throughout the genome, as long as their regulatory regions have activator-binding or operator sequences that can all bind the same activator/repressor
  o Regulon: a set of genes controlled by the same regulatory gene, which expresses a protein acting as a repressor or activator
Each version of the sigma factor enables RNA polymerase to bind to a number of genes' promoters
*A trans-acting element is usually a DNA sequence that contains a gene. This gene codes for a protein (or microRNA or other diffusible molecule) that will be used in the regulation of another target gene.
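The cis/trans logic from the lac operon partial-diploid (merozygote) genotypes described earlier can be sketched as a small Python model. This is only an illustrative sketch under simplifying assumptions: the names `LacCopy`, `active_repressor_present` and `makes_beta_gal` are hypothetical, only the I, P, O and Z elements are modeled, and catabolite repression (glucose/CRP-cAMP) and polar effects of nonsense mutations are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacCopy:
    """One copy of the lac operon; unspecified elements default to wild type (+)."""
    I: str = "+"  # "+" normal repressor, "-" inactive repressor, "s" superrepressor
    P: str = "+"  # "+" functional promoter, "-" RNA polymerase cannot bind
    O: str = "+"  # "+" normal operator, "c" constitutive (repressor cannot bind)
    Z: str = "+"  # "+" functional beta-galactosidase gene, "-" nonfunctional


def active_repressor_present(copies, lactose):
    """Trans-acting step: repressor made from any copy diffuses to every operator."""
    alleles = {c.I for c in copies}
    if "s" in alleles:
        return True  # the superrepressor cannot bind the inducer, so it stays active
    # The normal repressor is active only when there is no lactose (no allolactose).
    return "+" in alleles and not lactose


def makes_beta_gal(copies, lactose):
    """Return True if at least one copy yields functional beta-galactosidase."""
    repressed = active_repressor_present(copies, lactose)
    for c in copies:
        # Cis-acting step: a promoter or operator only affects its own copy.
        transcribed = c.P == "+" and (c.O == "c" or not repressed)
        if transcribed and c.Z == "+":
            return True
    return False


# lacI+ lacZ- / lacI- lacZ+ (chromosome / plasmid)
merozygote = [LacCopy(Z="-"), LacCopy(I="-")]
print(makes_beta_gal(merozygote, lactose=False))  # False
print(makes_beta_gal(merozygote, lactose=True))   # True
```

The repressor is handled once for the whole cell (trans), while the promoter and operator are checked per copy (cis), which is why an Oc mutation only matters when it sits next to a Z+ gene.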
Cis-acting elements, unlike trans-acting elements, do not code for protein or RNA.
*ESSAY QUESTION

A Second Means Of Negative Control: Transcription Of The trp Operon Can Be Attenuated
The trp operon contains the leader gene (trpL), which lies between the operator and the five structural genes whose proteins synthesize tryptophan
RNA polymerase transcribes the leader mRNA, and the ribosomes translate the leader mRNA, creating the leader peptide
The concentration of tryptophan determines whether the ribosome can smoothly translate the leader mRNA
This in turn determines whether the structural genes of the trp operon get transcribed
The trp operon sequence contains several critical regions:
  o Region 1, in the leader sequence, contains two consecutive tryptophan codons
  o Regions 2, 3 and 4: Region 3 can form a stem-loop structure by binding with Region 2 or Region 4, but not both (it prefers to bind to Region 2)
*Transcription of the leader sequence is followed by translation (or attempted translation) of the leader sequence to get the leader peptide.
*How well the leader is translated determines where Region 3 binds, which determines whether transcription of the structural genes happens or not
RNA polymerase must transcribe past Region 4 to reach the structural genes
  o If Regions 3 and 4 bind, the stem-loop acts as a terminator, and RNA polymerase stops before the structural genes
  o If Regions 2 and 3 bind, Region 4 is left open, no terminator forms, and RNA polymerase can go on to transcribe the structural genes
  o If tryptophan levels are high, the ribosome has no trouble reading through the two tryptophan codons, because there are plenty of tryptophan-carrying tRNAs available
  o The ribosome translates all the way to the STOP codon on the trpL mRNA, which allows it to cover Region 2 and block formation of the stem-loop between Regions 2 and 3
  o Region 3 therefore binds with Region 4, and this prevents transcription of the structural genes of the trp operon
  o If tryptophan levels are low, the ribosome stalls at the tryptophan codons as it tries to translate the leader peptide, because it has trouble finding the tryptophan-carrying tRNAs it needs
  o This allows the stem-loop to form between Regions 2 and 3, which keeps the Region 3-4 stem-loop from forming and lets RNA polymerase go on to transcribe the structural genes of the trp operon
If the stem-loop forms between Regions 3 and 4, transcription is attenuated

Some Archaeal Transcription Factors Can Activate And Repress Transcription
Ex.
Pyrococcus furiosus lacks a glucose transporter but can perform glycolysis (it needs to convert other sugars to glucose)
The TrmBL1 protein simultaneously represses genes that make transport proteins for other sugars, and activates genes for gluconeogenesis (synthesis of glucose)
TrmBL1 binds to a site downstream of the B recognition element and TATA box of the genes that encode the maltodextrin and maltose/trehalose transporters, and prevents RNA polymerase from transcribing them
TrmBL1 also binds to a site upstream of the B recognition element and TATA box of genes that encode enzymes that synthesize glucose, and recruits TBP, TFB and RNA polymerase to the site, activating transcription
*Important: some transcription factors can both activate and repress transcription. That is all you need to know

Signal Transduction Pathways (aka Two-Component Regulatory Systems) Activate Transcription Factors
STEPS
1. Molecules from the environment bind to the extracellular domain of a transmembrane protein that is a sensor kinase. When the signal molecule binds, the sensor kinase phosphorylates itself (autophosphorylation)
   a. Polar molecules bind to an extracellular receptor; a biochemical cascade is kicked off in the cell and turns the appropriate genes on
   b. Nonpolar molecules come across the membrane and find a receptor protein within the cell
2. The phosphate group is transferred to the response regulator, which is a transcription factor that is activated by phosphorylation
3. The response regulator then activates/inhibits its target genes

Microbes Activate Genes Through Quorum (Required Numbers Present) Sensing
Ex.
Bacteria that are luminescent (glowing)
  o Each bacterium secretes an activator called AHL, which can activate several genes that make a light-producing enzyme called luciferase
  o When there are only a few bacteria, the AHL diffuses into the environment and does not enter the bacterial cells
  o Once enough bacteria accumulate, the AHL concentration in the environment gets high enough for the AHL to diffuse into the bacterial cells, and the AHL turns on the genes that make luciferase, so the bacteria glow
*Deep-sea creatures: shine light so their shadow is invisible. A lot of bacteria that make you sick do so with quorum sensing, turning on genes that make proteins that help make you sick

Several Virulence Factors Are Activated Via Quorum Sensing
Ex. Staphylococcus aureus (causes serious wound infections and pneumonia): quorum sensing enables the bacteria to secrete their toxins
They make the inducer AIP, and transport it out of the cell
When enough bacteria are present, the AIP binds to the sensor kinase AgrC, causing it to autophosphorylate
The phosphate is transferred to the transcription activator AgrA, which induces activity in genes that encode virulence proteins that help the bacterium adhere to and invade your cells, as well as secrete toxins

Adapting To Hard Times: The Stringent Response
When the concentration of nutrients gets low in an environment, this may initiate the stringent response in bacteria
Synthesis of rRNA, tRNA and ribosomes stops
Protein and DNA synthesis stops
Amino acid synthesis increases, which enables synthesis of new proteins to compensate for the lack of certain nutrients in the environment
  o Ex.
must make enzymes to synthesize amino acids that are now no longer available
*Halt the usual protein synthesis, stop replicating DNA, and make new amino acids/proteins

Translation Can Also Be Regulated By Antisense RNAs
Antisense RNAs (aka siRNA, microRNA, interfering RNA, small RNAs) have sequences that are complementary to certain genes' mRNAs; they can bind to these mRNAs and either increase or decrease translation, depending on the mRNA
  o Example: Enabling Translation By Altering The Secondary Structure Of The mRNA
    The RpoS mRNA has a secondary structure near the 5' end that gets cleaved by an RNase, making it impossible for a ribosome to translate the mRNA
    The DsrA antisense RNA binds to the 5' end of the RpoS mRNA, changing the mRNA's secondary structure and changing the cut site for the RNase, enabling translation
    The DsrA (Downstream Region A) Gene Encodes An Antisense RNA That Enables Protein Production
  o Example: Decreasing Translation
    The ompF gene encodes a channel that allows water and ions to pass into the cell
    When the environment has a high osmolarity, the micF gene (encodes mRNA-interfering complementary RNA) is activated and produces an antisense RNA
    The micF RNA binds to the 5' region of the ompF mRNA, inhibiting ribosome binding and therefore translation
    This prevents the cell from accumulating too high a concentration of ions
    *In a hypertonic (high-osmolarity) environment, the cell does not want molecules coming in
    The micF (Messenger RNA-Interfering Complementary RNA) Gene Encodes An iRNA

Translation Can Also Be Regulated By Riboswitches
Many RNAs contain regions where proteins and other molecules can bind and control whether translation takes place or not; these regions are called riboswitches, because they switch translation on and off
This is often an example of end-product inhibition: the end product of a synthetic pathway is the regulatory molecule, and when it binds, this inhibits ribosome binding to the mRNA, and therefore inhibits protein production
*Changes
structure of mRNA

Translation Can Also Be Regulated By Ribozymes And End-Product Inhibition
Many RNAs contain regions where other molecules can bind and cause the mRNA to cleave itself; these RNAs are called ribozymes, because they act as enzymes when stimulated
These mRNAs encode the enzymes that synthesize various molecules
When there is a high concentration of the product molecule, it binds to the ribozyme and causes the mRNA to cleave itself
o No translation

Regulation of Gene Expression in Eukaryotes

Constitutive (Housekeeping) Genes And Inducible/Repressible Genes
All your cells need certain essential housekeeping functions performed
Constitutive (aka housekeeping) genes are always on, at all times, in all cells
o Ex. Actin and other genes that make components of the cytoskeleton
o Ex. Genes that make the enzymes that carry out glycolysis
Other genes are inducible and repressible; their level of activity changes as your needs change
o Ex. Metallothionein and other genes whose proteins clear metals can be induced by the presence of metals in the body

Tissue-Dependent Gene Regulation Is Responsible For Differentiation Of Your Different Tissue Types
In order for you to have the variety of tissue types you have, you must have a highly refined system whereby your genes are regulated
All your cells have the same genes in them, but only a certain subset of the genes in any one cell ever gets turned on
The subset of genes that gets turned on in the cell determines what proteins the cell has, which in turn determines what properties the cell has
o Some of this is due to tissue-specific epigenetic differences in genes (more on epigenetics later)

Your Cells Become More Specialized With Every Cell Division
The subset of genes that gets turned on in the cell determines what proteins the cell has, which in turn determines what properties the cell has
*Mutations can occur

Time-Dependent Gene Regulation: Globin Genes
Gamma-Hb has higher affinity for O2 than beta-Hb.
This allows the fetus’s Hb to take O2 out of the mother’s blood.
*Some genes are on during certain points of your life, others are not

Differences In Gene Activity Levels Are Part Of The Basis For Our Biochemical Individuality
Like all biological traits, gene regulation is variable between individuals
o The level of activity of the typical gene varies from individual to individual, which means the level of activity of most proteins in the body varies from individual to individual
This is one of the things that makes you the unique individual you are: we each have our own unique combination of high activity in some genes, middling activity in others, and low activity in others
o Differences in the activity level of our genes probably account for far more of our interindividual differences than differences in coding region sequences do

Gene Expression Is Regulated at Several Stages Of The Process
Because these processes do not involve changing the sequence of bases in the gene, they are generically referred to as epigenetic factors
DNA Packaging: Chromatin configuration changes to make the promoter region more/less accessible to transcription factors
Facultative Heterochromatin: Some regions will be packaged as heterochromatin in one cell type, but as euchromatin in another
Transcription Initiation: Interactions between regulatory sequences and transcriptional activators/inhibitors
RNA Processing: Alternative cleavage and splicing of pre-mRNAs
RNA Stability: Poly-A tail and interfering RNAs regulate RNA longevity
Translation Initiation: 5’ and 3’ UTRs and interfering RNAs regulate the rate of translation

Transcription Is Regulated By Both Cis-Acting And Trans-Acting Elements
Cis-acting elements lie on the same chromosome as the gene
o Ex. promoters, enhancers, insulators and other sequences that lie in the regulatory regions
Trans-acting elements are produced elsewhere and bind the regulatory sequences
o Ex.
activators, repressors that bind to the promoter region, or insulator-binding proteins
The transcription factor protein complex and RNA polymerase (the basal transcription apparatus) bind to the core promoter; this provides the ability to transcribe, but at a minimal rate
o Transcriptional activators and repressors bind to the regulatory promoter or the enhancers; they can increase the rate of transcription, or keep the gene silenced

Signal Transduction Pathways Activate Transcription Factors
Your body releases hormones that cause cells that have the appropriate receptor proteins to activate and inhibit specific genes
STEPS
1. The hormone binds to the extracellular domain of the receptor protein (the receptor is usually on the surface of the cell)
2. The receptor protein has an intracellular domain that initiates a biochemical cascade, often involving phosphorylation of proteins inside the cell
3. The biochemical cascade alters a transcription factor, and allows it to enter the nucleus and activate/inhibit its target genes
*Faster response system

Response Elements Enable Eukaryotes to Regulate Multiple Genes Simultaneously
A response element is a short regulatory sequence that is found in the promoter regions of several genes whose proteins must work together
GRE = glucocorticoid response element
MRE = metal response element
TRE = tetracycline response element
*Stress related; transcription factor binding site

Insulators Can Prevent Gene Activation By Transcription Factors

Chromatin Configuration Regulates Gene Activity
Recall that your chromatin has regions of heterochromatin that are more condensed than their euchromatin counterparts, and that the supercondensation that occurs in regions of heterochromatin silences the genes in that region
In order to allow genes to be expressed, even in the euchromatic regions, the local chromatin must be rearranged to allow the transcription factor proteins and RNA
polymerase to access the gene’s promoters
In order for transcription to occur, transcriptional activators must bind to the appropriate sequences in the gene’s promoter region
o The more tightly the chromatin is compacted, the more difficult it is for the transcriptional activators to access the gene’s promoter region

DNase I Sensitivity Mirrors The Level Of Transcriptional Activation In The Region
DNase I is an enzyme that cuts DNA next to each pyrimidine nucleotide (C and T); it cuts DNA down to approx. 1-8 bp fragments
When chromatin is relaxed, it is more sensitive to DNase I
o More transcriptionally active regions are more sensitive to DNase I
*Note: This is used as a laboratory assay; it is not involved in regulation of transcription

In Drosophila, Chromosome Puffs Appear In Transcriptionally Active Regions
In some species, chromosome puffs form in regions in which several genes are active

Chromatin Configuration Regulates Gene Activity
Factors that influence chromatin configuration are referred to as epigenetic factors
Epigenetic factors influence the level of activity in a gene without changing the base sequence of the gene
The best-known chemical modifications are:
o Methylation of histones (often the lysines)
o Acetylation of histone proteins (often at positively charged lysines)
o Methylation of DNA (often the Cs in CpG islands in the promoter region)

Methylation Of Histones Regulates Transcription
Methylation can enable the activation or repression of gene activity, depending on the specific amino acids that get
methylated
o Ex. Three methyl groups (CH3) added to lysine 4 of histone H3, aka H3K4me3 (K = lysine), enhance transcription
o Some of the proteins that bind to H3K4me3 decondense the local chromatin

Acetylation Of Histone Proteins Makes The DNA Accessible To Transcription Factors

Chromatin-Remodeling Complexes Can Also Control Gene Activity
Chromatin-remodeling complexes are complexes of transcription factors and other proteins that can move nucleosomes around, thereby exposing promoter sites so transcription factors can bind them
o Some slide the nucleosome down the DNA a bit, exposing a promoter region
o Some change the conformation of the DNA and/or nucleosome, exposing a promoter

In Some Cases The DNA Gets Methylated
In some cases, the DNA itself gets methylated, which usually inhibits transcription of the genes in that region
o One of the better-known mechanisms for silencing transcription involves methylation of cytosines in the promoter regions of genes
o This prevents transcription factors from binding; transcription does not occur
In most eukaryotes, 2-7% of all cytosines are methylated
o This doesn’t happen in Drosophila or yeast, so it is not a universal control mechanism
Methylating cytosine is one of the best ways to silence genes

Increased Trinucleotide Repeat Length Increases Methylation And Leads To Fragile X Syndrome
The FMR1 gene has a trinucleotide repeat (CGG) in the promoter region
The length of the repeat is highly variable; a repeat of up to roughly 50-58 CGGs gives normal gene activity
The repeat can expand during meiosis: someone with 60 CGGs in the repeated string can have a child with 250 CGGs in the repeated string
o Too many (> 200) CGGs --> increased methylation of the gene --> deacetylation of histone proteins --> inhibition of transcription of FMR1 --> Fragile X syndrome

X-Inactivation Silences Most Of The Genes On One X Chromosome In Humans
In mammals, any individual that has more than one X chromosome inactivates one
of the Xs
o Apparently it is deleterious to development to have two active copies of every gene on the X chromosome
Some cells inactivate the X that came from Dad; other cells inactivate the X that came from Mom
o Most, but not all, of the genes on that X are silenced: approx. 75% are completely silenced, approx. 15% escape inactivation (full activity), and for the other approx. 10% the degree of silencing is variable between individual women
The X that gets inactivated produces an RNA called X-inactive specific transcript, XIST
o The XIST RNA coats the chromosome, supercondenses it and fosters methylation of promoter region Cs
The other X protects itself by producing the TSIX RNA, which is complementary to the XIST RNA
o The TSIX RNA binds the XIST RNA, preventing the XIST RNA from inactivating the second X
*64 cells; 1/2 are inactivated; only in females
When you look at the chromosomes under the microscope, the supercondensed, inactivated X chromosome looks like you balled up a sheet of paper and threw it on the ground
For a while when people first started looking at chromosomes under microscopes, they thought this was just an artifact, a piece of junk they could not get off the slide
It is now called the Barr body, after Murray Barr, who first figured out that it was actually the other X chromosome in a female’s karyotype

Imprinted Genes Are Regulated By Differential Methylation Of The Promoter Regions
You are supposed to have only one active copy of an imprinted gene
o For some imprinted genes, the copy you inherited from your mother is the active copy, and for other imprinted genes, the copy you inherited from your father is the active copy
In both cases, everyone inactivates the same copy of the gene, either the maternally derived or paternally derived copy
*Cells that inactivated Dad’s (or Mom’s) X chromosome may leave more descendants; this skews X-inactivation over time, so it may be more than 50%
Many promoters have CG islands (aka CpG islands): a cluster of Cs and Gs, with a lot of
CG dinucleotides; methylation of the Cs that are followed by Gs represses transcription. Note that this is methylation of the DNA itself, not the histones
*An imprinted gene is a gene for which only one of the 2 inherited copies is allowed to be ACTIVE. Methylation of the promoter region determines which copy is silenced

Methylation Patterns Must Be Reset During Gametogenesis
Half of your genes are methylated with a pattern that marks them as having come from a parent whose sex is the opposite of yours
During spermatogenesis and oogenesis, the DNA is completely demethylated, then re-methylated in the pattern that is appropriate for a gene that comes from the parent of your sex
o If the methylation patterns are not established properly, the child will have either two or zero working copies of its imprinted genes, instead of the one working copy he/she should have

DNA Methylation And Histone Deacetylation Are Coupled
Methylated-DNA-binding proteins have a domain that binds methylated DNA and a domain that has histone deacetylase activity; they tighten the chromatin in the regions where promoter region Cs are methylated
o Ex. MECP2: X-linked gene; mutations have sex-influenced effects
o Cause Rett syndrome in females: normal development for approx.
6 months, then regression of cognitive capabilities; the head ceases to grow, resulting in microcephaly
o Cause an extremely variably expressed syndrome featuring cognitive impairment and other CNS features in males
*Where the mutation is in the gene determines severity

Epigenetic Factors Are Variable Between Individuals
Variation in the degree of activation/suppression by epigenetic factors is responsible for a lot of inter-individual variability
o As identical twins age, they develop different traits, as well as different patterns of DNA methylation and histone acetylation
Personalized medicine tests are starting to take epigenetic factors such as the methylation of DNA into account as factors that influence your risk for disease or response to drugs
o Trying to treat the specific person rather than the illness

Epigenetic Factors Are Influenced By Early Life Experience, Including Maternal Behavior
Recently it has been shown that the level of methylation of the DNA is influenced by the amount of maternal licking the rat receives during infancy
The effect is abolished if you infuse the young rats’ brains with histone deacetylase inhibitors
Effects can be seen on the glucocorticoid receptor gene/protein and the offspring’s responses to stress
These effects persist into the offspring’s adulthood

Epigenetic Factors Can Be Influenced By Diet
In bees, some female larvae eat honey (the usual bee food), and develop into worker bees
Some females are fed royal jelly; they develop into queens
o The royal jelly silences the DNA methyltransferase 3 gene (Dnmt3); this changes the methylation of, and expression of, a number of genes, causing the female to develop as a queen and be able to reproduce

Epigenetic Factors Are Heritable
In humans, if a pregnant woman has a metabolic abnormality (ex.
diabetes, high blood lipids), this can change the methylation pattern of some of her genes and some of her child’s genes
o It appears that the child’s epigenetics are set as if the child’s body is expecting the child to experience the same nutritional balance (ex. excessively high sugar or fat levels) after birth as it does before birth
o After the child is born and gets put on a more balanced nutritional regimen, its body does not metabolize these nutrients optimally
It has long been known that in rats, stressing a pregnant female can change the birth weights and several behavioral/biochemical factors not only in her offspring, but also in her offspring’s offspring, and possibly the generation after that as well
Offspring of stressed females had abnormal behavioral and biochemical responses to stress
More recent research has shown that there are changes in methylation of the glucocorticoid receptor gene in these offspring’s brains

The Inheritance Of Epigenetic Factors May Be Gender-Dependent
In mouse experiments, feeding mice a high fat diet not only made those mice obese and insulin resistant (as in type 2 diabetes), but also caused their female, but not male, offspring to be obese
An obese mother (normal weight father) had consistently obese daughters
An obese father (normal weight mother) caused approx. 15% of his daughters to be obese

Heritable Epigenetic Factors Underlie The Phenomenon Of Paramutation
Paramutation = heritable changes in gene expression due to epigenetic effects
o Ex.
In corn, the R-r allele of a pigmentation gene causes the kernel to be purple; the R-st allele causes the kernels to be spotted
o The R-st allele is dominant over the R-r allele; heterozygotes will have spotted kernels, because the R-st allele produces an interfering RNA that actually reduces expression of the R-r allele
o Surprisingly, if an offspring inherits that R-r allele, without inheriting the R-st allele, the activity of the R-r allele is reduced in the offspring
   *Can still see spots on the kernel even though the R-st allele is not there
   This effect diminishes over several generations
o Paramutation has been seen in mice as well
o The Kit gene contributes to pigmentation; the wildtype Kit+ allele produces normal pigmentation; the Kitᵗ allele is a recessive lethal allele; heterozygotes have white tails and feet
   The Kitᵗ allele produces a microRNA that degrades the Kit mRNA
o When you cross Kit+ Kit+ x Kit+ Kitᵗ, some of the homozygous Kit+ Kit+ offspring have white tails and feet, illustrating that the expression of the Kit+ allele has been changed, that this change is heritable, and that it will affect the offspring phenotype even in the absence of the Kitᵗ allele
o Injecting homozygous Kit+ Kit+ mice with RNAs from mice that had the heterozygous Kit+ Kitᵗ genotype reduced the amount of Kit+ mRNA present in the cells, and caused the mice to have white tails and feet
   Injecting synthetic microRNAs that were designed to degrade the Kit+ mRNA into mice with the Kit+ Kit+ genotype caused the mice to have white tails and feet

Specialized DNA Methyltransferases Maintain Epigenetic Changes Through DNA Replication
There are a number of different versions of DNA methyltransferase in the human proteome
Some specifically recognize and methylate hemi-methylated DNA
This methylates the newly synthesized DNA strand

More Complex Organisms Have More Complex Gene Structure And The Ability To Produce Multiple Proteins From A Single Gene
A gene that distributes its coding sequence across many exons can
make several different versions, or isoforms, of its protein
o The human genome contains ~23,000 genes, which make ~100,000 proteins; therefore the typical gene makes 4-5 different isoforms of its protein
Some genes have multiple sites at which they cleave the mRNA, producing mRNAs that contain different sets of exons
o Genes can also splice out one or more exons during RNA splicing
The different isoforms can have the same active domains, but different regulatory domains, allowing different tissues to turn the protein on and off at different times
The different isoforms may also have different activities and perform different functions

Alternative Splicing Of The Transformer (Tra) Gene mRNA By The Sex-Lethal (Sxl) Protein Regulates Sex Determination In Drosophila

RNA Must Be Stabilized
RNases degrade mRNAs, beginning at the poly-A tail
mRNAs must be stabilized if they are going to remain in the cytoplasm long enough to be translated
o Poly-A binding proteins bind to the poly-A tail and stabilize the mRNA, but once the poly-A tail shrinks to approx.
10-30 As, the poly-A binding proteins cannot bind to the poly-A tail any more, and the mRNA quickly degrades

RNA Interference Regulates Many Genes
There are several types of interfering RNAs (iRNAs), and they work through several different mechanisms; the best known are microRNAs (miRNAs) and small interfering RNAs (siRNAs)
Their similarities are more important than their differences
Both are formed when a longer, double-stranded RNA precursor is cleaved in the cytoplasm by the enzyme Dicer
Both join with proteins to make up the RISC (RNA-induced silencing complex)
All interfering RNAs rely on having a base sequence that is complementary to some portion of either the mRNA’s sequence or the gene’s sequence, so the iRNA can bind to either the mRNA or the gene itself

One Important Difference Between miRNAs And siRNAs
siRNAs are exogenous double-stranded RNAs that are taken up by cells, or enter via vectors like viruses
miRNAs are single stranded and endogenous; they are transcribed from sequences in the cell’s genome, and some are found within the introns of genes
o Some of what was once called nonfunctional “junk” DNA has turned out to include miRNA-encoding sequences
The first two mechanisms to be discovered for interfering RNAs’ actions involved interactions with the mRNA:
o Inhibition of translation: If the iRNA can bind to some portion of the 5’ end of the mRNA, this will prevent the ribosome from reading the mRNA
o Cleavage of the mRNA: The RISC complex includes endonucleases that cleave double-stranded RNAs; if the iRNA binds a portion of the mRNA, the endonucleases will cleave the mRNA
*(2nd pic) Inhibiting ribosome binding is especially effective if the iRNA can bind to the 5’ portion of the mRNA
The early research studies all used concentrations of iRNAs that delivered hundreds or even thousands of copies of iRNA per cell
Fire and Mello reported being able to completely abolish protein production with a
concentration of iRNA that delivered approx. two molecules of iRNA per cell
This led to the conclusion that some iRNAs must be able to bind to the gene itself and inhibit transcription
o There are many more than two copies of most genes’ mRNAs in a cell; two molecules of iRNA could never bind to all those molecules of mRNA
Some iRNAs cause the gene to be methylated, inhibiting transcription
Some iRNAs inhibit protein production by an unknown mechanism

Long Noncoding RNAs Regulate Gene Expression
We have already discussed how the XIST RNA silences activity of most of the genes on one of a female’s X chromosomes, by inducing methylation of the chromosome and supercondensation of the chromatin
The lincRNA-p21 represses the action of the transcription factor p53, which itself regulates a number of genes whose proteins regulate the cell cycle and apoptosis
o p53 dysfunction is frequently involved in human cancers

Ubiquitin Tags Proteins For Degradation
Ubiquitin brings proteins to the proteasome, to be degraded; the proteins that attach ubiquitin to specific target proteins can be increased or decreased, altering the lifespan of the protein

Population Genetics

Gene Alleles, Genotype And Phenotype
The human DNA sequence is highly polymorphic: for any gene, there are many different specific versions of that gene’s sequence in the population
Each of the different versions of the gene’s sequence is referred to as an allele of that gene
Some alleles of a gene make a version, or isoform, of the protein that has a significantly higher or lower level of activity than the isoform made by most other people
This is part of the reason why individuals within species are different from each other, and also part of the means by which new species evolve
Genotype = the combination of gene alleles you possess for a given gene
Heterozygous genotype = you have different versions of the gene’s sequence (two different alleles) in your two copies of the gene
Homozygous genotype = you have the
same version of the gene’s sequence (two copies of the same allele) in your two copies of the gene
Phenotype = your observable traits
Can be used to refer to:
   physical traits
   personality
   biochemical parameters (ex. blood sugar level)
   susceptibility to specific diseases
   response to specific drugs
Your phenotype is determined by the interaction between your genotype and nongenetic factors that you encounter through your diet, environment and lifestyle
o Ex. height is influenced by many genes, as well as childhood stress, nutrition, sleep and other factors

Population Genetics Deals In Changing Allele Frequencies And Genotype Frequencies
Study of the genetic composition of a population of individuals
Study of the changes in allele and genotype frequencies that can happen in response to certain events/conditions
Population = Group of individuals that reproduce sexually and interbreed within that population
The population’s gene pool = all the different gene alleles that are present in the population

Allele Frequencies And Genotype Frequencies
Allele frequency = # of alleles of the type in question / total number of gene alleles
Remember that a homozygous genotype contains two of that allele, while a heterozygous genotype contains only one of a given allele
To calculate the frequency of an autosomal allele, or an X-linked allele in a female-only population, use the formula: f(A) = f(AA) + f(Aa)/2
Genotype frequency = # of people with the genotype in question / total number of people (which is the same as the total number of genotypes)
Remember that, for X-linked genes, a woman has two alleles per gene and a man has one (hemizygous), but both have only one genotype
For Y-linked genes, only males have a genotype, which contains only one allele per gene (hemizygous)

Population Genetics Deals In Changing Allele Frequencies And Genotype Frequencies
Examples of forces that change allele/genotype frequencies:
Unrelated outsiders migrating into a population
Nonrandom
mating within a population
Gene mutations
Specific gene alleles being favored by chance
Natural selection
Groups being forced to flee their homes by war or natural disasters

Hardy-Weinberg Equilibrium
Godfrey Hardy & Wilhelm Weinberg (1908) independently published mathematical theories of genetics that stated two principles:
1. That, under certain circumstances, the frequency of all gene alleles within a population will stabilize
2. That, when something changes the frequency distribution of alleles in a population, after one generation of random mating, you can predict the frequency of homozygous and heterozygous genotypes in the population if you know the frequencies of the specific alleles in question.
Specifically, if there are only two alleles for a gene, and you let p = frequency of the A allele and q = frequency of the a allele:
   frequency of AA = p²
   frequency of Aa = 2pq
   frequency of aa = q²
   Note also that p² + 2pq + q² = 1
The Punnett square illustrates that HWE theory also applies if there are more than two alleles for a gene
If the gene in question is X-linked, remember that males only have one allele
Note that both of these principles rest on assumptions:
   The population is large
   The population mates randomly within itself (i.e. no artificial selection)
   No migration into the population by outsiders
   No genetic mutations
   No gene alleles get favored by chance
   There is no advantage or disadvantage in terms of genetic fitness in having one allele/genotype or another of a given gene
It is the fact that the HWE law rests on assumptions that makes it interesting
o When the population you are studying has a gene whose allele/genotype frequencies are not in Hardy-Weinberg equilibrium, it may mean that there is something interesting about that gene/allele/genotype, ex.
one of the alleles/genotypes increases or decreases a person’s risk for disease, or otherwise increases/decreases genetic fitness in that environment

Implications Of Hardy-Weinberg Equilibrium
A population cannot evolve if it is in HWE, because its allele frequencies (and therefore genotype frequencies) will never change
While sexual reproduction maintains genetic diversity, it is not sufficient to power evolution, because it doesn’t cause allele/genotype frequencies to change
It takes mutations, migrations, selection or chance to drive evolution
You can calculate allele frequencies even if there is allelic dominance; this is often used to estimate the carrier frequency for a recessive mutation in a population (let’s call the recessive disease-causing mutation the a allele)
o Note that you can’t calculate allele frequencies directly by counting phenotypes, because AA and Aa produce the same phenotype
If the population is in HWE, and you know the frequency of a recessive disease in the population, the frequency of affected people = q², because the affected people have the aa genotype
Taking the square root of that frequency gives you q, the frequency of the a allele
Calculate p = 1 - q to get the frequency of the A allele; p² = freq(AA genotype)
Carrier frequency = 2pq
This can be used to calculate the probability that a child will be affected with a recessive disorder (if you know the frequency of the disease in the population)
o Example: Father is a carrier of an autosomal recessive mutation. Mom hasn’t been tested, but is unaffected. What’s the probability the child will be affected?
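
The carrier-father example above can be worked through with a short Python sketch. The notes do not give a disease frequency, so the 1-in-10,000 figure below is purely an illustrative assumption, and the function names are made up for this sketch. It follows the steps listed above (affected frequency = q², then p = 1 - q, then carriers = 2pq) and assumes the population is in HWE.

```python
from math import sqrt

def hwe_from_disease_frequency(disease_freq):
    """Return (p, q, carrier_freq) for a recessive disease, assuming HWE."""
    q = sqrt(disease_freq)      # affected people are aa, so f(aa) = q^2
    p = 1.0 - q                 # only two alleles, so p + q = 1
    carrier_freq = 2.0 * p * q  # heterozygotes (Aa)
    return p, q, carrier_freq

def prob_child_affected(disease_freq):
    """Father is a known carrier (Aa); mother is untested but unaffected."""
    p, q, carrier_freq = hwe_from_disease_frequency(disease_freq)
    # Mother is known NOT to be aa, so condition on her being AA or Aa.
    mother_is_carrier = carrier_freq / (p * p + carrier_freq)
    # Two carrier parents have a 1/4 chance of an aa child.
    return mother_is_carrier * 0.25

# Hypothetical disease frequency: 1 affected person per 10,000 (assumption).
p, q, carriers = hwe_from_disease_frequency(1 / 10_000)
print("p =", p, "q =", q, "carrier frequency =", carriers)
print("P(child affected) =", prob_child_affected(1 / 10_000))
```

One design choice to note: the sketch divides the carrier frequency by the frequency of all unaffected genotypes, because Mom is known to be unaffected. For a rare disease this is almost the same as simply using 2pq for her chance of being a carrier, which is the shortcut many courses use.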
Testing For Hardy-Weinberg Equilibrium
*Comparing numbers I got with numbers I was expecting
o Are they similar to the ones I was expecting?
In order to test whether a population is in HWE or not, you perform a chi square test, using HWE theory to provide your expected numbers, and calculating degrees of freedom (df) as # of possible different genotypes - # of different alleles for that gene
o For a gene with two alleles, there are three possible genotypes, so 1 df
o Example: Imagine that mutations in Gene A produce the autosomal recessive Syndrome A
   For this purpose, there are two types of alleles of Gene A in the population:
      Normal alleles (symbolized by A)
      Recessive disease-causing mutations (symbolized by a)
o Imagine you determine everyone’s genotype for Gene A and report that, in a population of 10,000 people, you had
      3,400 people with the AA genotype
      5,900 people with the Aa genotype
      700 people with the aa genotype
   Is the population in HWE for Gene A?
   Allele frequencies from these counts: p = (2 x 3,400 + 5,900) / 20,000 = 0.635, so q = 0.365
   Expected counts under HWE: AA = p² x 10,000 ≈ 4,032; Aa = 2pq x 10,000 ≈ 4,636; aa = q² x 10,000 ≈ 1,332
   Chi square = sum of (observed - expected)² / expected ≈ 99.1 + 344.9 + 300.0 ≈ 744
The Critical Values Table tells you how big your value of chi square needs to be in order for you to declare your population out of HWE
We will use 1 df and p = 0.05; our critical value is 3.84, much less than our obtained value of chi square
*If chi square is greater than the critical value, the population is not in equilibrium; if it is less, the differences never amounted to enough
We therefore conclude that our population is not in HWE
*DON’T NEED TO CALCULATE A CHI SQUARE

Effects Of Genetic Drift
Genetic drift refers to the possibility that the gametes that made the next generation just happened to have a disproportionate number that possess one particular allele of a gene
The smaller the population, the more likely genetic drift is to happen; sampling error is always greatest when the population is small
This can happen through a genetic bottleneck, in which a small sample of the population gets isolated (or survives a war) and makes up the original gene pool for the subsequent population
The allele frequencies in this group will be determined by who survived the war; there was no
natural selection involved
This creates a founder effect, in which the original gene pool of the population was very limited in its diversity
o *Found a society, then come back 10,000 years later, and you see a high frequency of the alleles the founders had and a low frequency of the alleles from the society they split from
Genetic drift will increase the frequency of some alleles and decrease that of others; for any given allele, it can go either way
o Genetic drift reduces genetic variability within the population; some alleles will become fixed (i.e. present at 100% frequency)
o Genetic drift also causes different populations to become more different from each other over time
o *Not net change over a great deal of time; Pop 2, 3, 4

Natural Selection
Some genotypes produce phenotypic traits that allow the individual to thrive relative to his/her peers in that given environment
In population genetics terms, the key is genetic fitness: the ability to reproduce relative to your contemporaries (how many children you produce compared to your contemporaries). Remember it is relative:
o If you have 5 children but everyone else in your generation has 1 child, your genes will make up a larger proportion of the next generation’s gene pool than they do in the present generation
o If you have 5 children but everyone else in your generation has 10 children, your genes will make up a smaller proportion of the next generation’s gene pool than they do in the present generation
The influence of natural selection on allele frequencies depends on the relative fitnesses of the heterozygous versus homozygous genotypes:
o Overdominance: Heterozygotes have greater fitness than homozygotes; this maintains the frequency of both alleles (they remain in the population)
o Underdominance: Heterozygotes have lower fitness than homozygotes; directional selection occurs, where one allele’s frequency increases more than the other’s
*Degree of rise and fall depends on how different the fitnesses of the two homozygotes are
*The degree to which
the environment favors each of the two homozygous genotypes will determine which allele increases more in frequency and how quickly the allele frequencies change
Genetic fitness (symbolized as W) ranges from 0 to 1.0
o The higher the better: the most prolific genotype has W = 1.0
To calculate the genetic fitness of any given genotype, you take the average number of offspring produced by an individual with that genotype and divide it by the average number of offspring produced by an individual with the most prolific genotype
*Prolific: Produce a lot
When the environment selects for a genotype, it often also selects against the other genotypes
Genetic fitness is related to the selection coefficient (symbolized as s), which reflects how severely selection works against a given genotype
s = 1 - W
When Heterozygotes Have The Highest Fitness
If you know the initial allele frequencies (and therefore the initial genotype frequencies) and the fitnesses of the genotypes, you can predict the effect of natural selection on the genotypes in the next generation using this formula (given here in its standard form):
  Mean fitness W(mean) = p²W11 + 2pqW12 + q²W22
  Next-generation f(11) = p²W11 / W(mean)
  Next-generation f(12) = 2pqW12 / W(mean)
  Next-generation f(22) = q²W22 / W(mean)
*Memorize these equations
This formula enables you to calculate the frequencies of the homozygous and heterozygous genotypes in the next generation
*Make sure: Next generation or equilibrium; Genotype frequency (genes) or allele frequency (mutation)
To calculate what the effect of natural selection on allele and genotype frequencies would be after multiple generations (or when equilibrium is reestablished) using the W formula, you'd have to calculate the new p², 2pq and q² values for the next generation, use those new frequencies to re-do the W formula to see what the frequencies would be in the generation after that, and so on until the frequencies stabilize
Special Consideration For X-Linked Genes (Natural selection)
Hardy-Weinberg doesn’t work for males!
For genes that lie on the X chromosome, because females have two alleles, Hardy-Weinberg theory applies to them
o For example, f(affected female) = q² for an X-linked recessive disease
Because males only have one allele, however, HWE theory does not apply to them
o Example: f(affected male) = q, since all he has to do to be affected is to inherit one mutant copy of the allele
For X-linked genes, do the regular calculation to get q² = f(affected females), then use that to get q = f(affected males)
Sometimes you want to predict the allele and genotype frequencies that will be present after equilibrium is established
If the heterozygous genotype has the lowest fitness, there will be directional selection: the most favored allele will eventually get fixed at a frequency = 1.0, so there won’t be any true equilibrium maintained
If the heterozygous genotype has the highest fitness, both alleles will remain in the gene pool, and the allele frequencies will change in proportion to the fitnesses of the two homozygous genotypes:
New f(Allele 2) at equilibrium = s11 / (s11 + s22)
o i.e. of all the selection pressure the environment is putting on the two homozygous genotypes, what fraction is being put on genotype 11
*Remember, selection = selection against the allele/genotype, so the higher s11 is, the more A2 alleles you will end up with
Recessive Disease Alleles Are Removed Very Slowly From The Population
In this case, there is selection only against the homozygous recessive genotype
W11 = W12 = 1.0 and W22 = 0
When the frequency of the recessive allele is high, and the frequency of homozygous recessive genotypes is high, the change in allele frequency from one generation to the next is big
As the frequency of the recessive allele decreases, a greater proportion of the recessive alleles will be found in unaffected heterozygous carriers, and will not be selected against; the change in allele frequency from one generation to the next is
small
Negative And Positive Eugenics
Negative eugenics = actively reducing the frequency of “bad” alleles
o Ex. Hitler
Positive eugenics = actively encouraging the propagation of “good” alleles
o Ex. Mating (not random), sperm banks
Negative eugenics was popular in America before World War II
Several states sterilized cognitively handicapped individuals to keep them from passing down their “bad genes”
Positive eugenics is practiced today:
Selective mating: you select your mates on the basis of some characteristic(s) that you think is (are) important
Sperm banks and egg banks = institutionalized positive eugenics
Negative Eugenics Programs Are Often Mathematically Misguided
Most genetic syndromes are recessive, and due to rare mutations
In recessive disorders, the vast majority of the mutations are found in unaffected carriers
A program based on sterilizing affected people only addresses a very small percentage of the mutations that exist in the population
o Ex. Imagine an autosomal recessive mutation with frequency = q = 0.01 (p = 0.99); in a population of 1,000,000 people (2,000,000 total alleles), there are 2,000,000 X 0.01 = 20,000 mutant alleles:
  Affected people: q² X 1,000,000 = 100 people, carrying 200 mutant alleles (1% of the total)
  Unaffected carriers: 2pq X 1,000,000 = 19,800 people, carrying 19,800 mutant alleles (99% of the total)
Truly Effective Negative Eugenics Programs Violate Many People’s Ethical Standards
In order to eradicate the mutation from the gene pool, you would have to sterilize the parents (and also the siblings) of affected people; the parents are not only unaffected, but have a ¾ probability of producing an unaffected child during their next pregnancy
NM X NM → ¼ NN, ½ NM, ¼ MM
You would also have to sterilize some of the unaffected relatives
o If you are affected with the disease, for each unaffected brother or sister you have, there is a 2/3 chance the sibling has the mutation as well
o Your unaffected siblings = 1 NN : 2 NM
Effects Of Nonrandom Mating
Nonrandom mating can take one of several forms:
o Positive assortative mating = mating with people who have characteristics similar to yours
o Negative assortative mating = mating with people who have characteristics
different from yours
Assortative mating, i.e. choosing mates based on their phenotypes, only affects the allele frequencies of the genes that influence those traits and genes that are linked to those genes
Effects Of Inbreeding
Inbreeding is a form of assortative mating: positive assortative mating for relatedness (opposite = outcrossing or outbreeding)
Unlike matings that are based on specific phenotypic features, inbreeding affects all genes’ alleles, not just a few genes’ alleles
Inbreeding increases the proportion of homozygotes in the population
o This can increase the frequency of genetic disorders, because healthy people carry 50-100 recessive disease-causing mutations in their genome, each of which will cause a disease if the child inherits the mutation from both parents
The degree of inbreeding is represented by the inbreeding coefficient, which reflects the proportion of that individual’s homozygous genotypes in which the two alleles are derived from a common ancestor
To determine this, you need to differentiate between a situation in which the two identical alleles in the homozygous genotype came down from different ancestors (identical by state) versus a situation in which the two alleles came down from a common ancestor (identical by descent)
*Understand identical by descent and pedigrees
o Ex. Ron’s parents are second cousins: both copies of the A1 allele came down from his great-great grandmother, so they are identical by descent
o Ex. Ron’s parents are unrelated: one copy of the A1 allele came down from each of his great-great grandfathers, so they are identical by state (coincidentally)
The inbreeding coefficient (symbolized as F) can range from 0 to 1.0
o The higher the number, the more homozygous
When inbreeding takes place for many generations, the proportion of heterozygotes is eventually (at equilibrium) reduced by 2Fpq, and the proportion of each homozygous genotype eventually increases by Fpq
Most extreme example: Self-fertilization, therefore F = 1
Homozygotes produce more
homozygotes (AA X AA → AA; aa X aa → aa)
Heterozygotes produce 1 AA : 2 Aa : 1 aa, so half their offspring are heterozygotes. Therefore, the proportion of heterozygotes gets cut in half each generation, and the proportion of each of the two homozygous genotypes rises by half that amount
If F < 1.0, genotype frequencies change more slowly
The more closely related the parents are, the faster inbreeding increases the proportion of homozygotes
o Eventually everyone is homozygous
Example: Recessive mutation with allele frequency = q = 0.02 (2%)
Therefore p = 0.98
If the population mates at random (F = 0), q² = 0.0004, i.e. 4 in 10,000 people are affected
Brother-sister matings (F = 0.25) will bring the incidence of the disorder (i.e. the frequency of the aa genotype) up to q² + Fpq = 0.0004 + (0.25 X 0.98 X 0.02) = 0.0053, i.e. 53 in 10,000 people are affected
Inbreeding depression = The increase in homozygous genotypes, and therefore in recessive disorders and sub-standard traits, caused by inbreeding
Inbreeding Has Worked For A Few Species
Some species (ex.
the garden slug) regularly inbreed: they have eliminated many of the deleterious alleles from their population because homozygotes for the “good” alleles have a significant mating advantage
o Got rid of bad alleles early on
Effects Of Gene Mutations
There is a constant flow of mutations and reverse mutations
Consider one gene for which there has been a mutation (creating allele A2). It is possible for there to be a reverse mutation (back to A1)
The more A1’s there are, the more A1’s are available to mutate to A2’s
As the frequency of A1’s drops, the frequency of A2’s rises, but now there will be fewer A1’s available to mutate into new A2’s, so the increase in the frequency of A2’s slows down
The higher the frequency of A2’s gets, the more A2’s are available to reverse mutate back to A1’s
Eventually the rate of forward and reverse mutations equalizes and allele frequencies stabilize
If μ = rate of forward mutation and ν = rate of reverse mutation, the frequency of the mutant allele at equilibrium = q = μ / (μ + ν)
o i.e. dependent solely on the two mutation rates
Effects Of Migration (aka Gene Flow)
Migration between two populations keeps genetic variability high within them both, and makes them more alike than they would be otherwise
o Example: Imagine two populations, where members of population 1 migrate to join population 2 each generation
  At first, let freq(a) = q1 in population 1; freq(a) = q2 in population 2
  After migration, if m = proportion of people in location 2 that have migrated in, the freq(a) reflects the proportion of migrants (m) versus residents (1 - m) in location 2
  freq(a) = q1(m) + q2(1 - m)
  The freq(a) has changed by m(q1 - q2)
  Eventually the freq(a) becomes equal in both populations
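
The mutation-balance and migration results above can be checked numerically. The sketch below is only an illustration: the function names are made up for this example, the mutation rates and starting frequencies are hypothetical values, and it assumes population 1 stays at q1 while a fraction m of population 2 is replaced by migrants each generation.

```python
def mutation_equilibrium(mu, nu):
    """Equilibrium frequency of the mutant allele: q = mu / (mu + nu)."""
    return mu / (mu + nu)


def next_q_after_mutation(q, mu, nu):
    """One generation of mutation: A1 -> A2 at rate mu, A2 -> A1 at rate nu."""
    return q + mu * (1 - q) - nu * q


def freq_after_migration(q1, q2, m):
    """freq(a) in population 2 when a fraction m of it are migrants from population 1."""
    return q1 * m + q2 * (1 - m)


# Hypothetical mutation rates (assumed values, not from the notes)
mu, nu = 1e-5, 1e-6
q_eq = mutation_equilibrium(mu, nu)
print("mutant allele frequency at equilibrium:", q_eq)
# At equilibrium, one more generation of mutation leaves q unchanged
print("after one more generation:", next_q_after_mutation(q_eq, mu, nu))

# Hypothetical migration example: q1 = 0.8, q2 = 0.2, 10% migrants per generation
q1, q2, m = 0.8, 0.2, 0.1
one_gen = freq_after_migration(q1, q2, m)
print("freq(a) in population 2 after one generation:", one_gen)
print("change:", one_gen - q2, "vs m(q1 - q2):", m * (q1 - q2))

# Repeating the migration step moves population 2 toward population 1
q = q2
for _ in range(200):
    q = freq_after_migration(q1, q, m)
print("freq(a) in population 2 after 200 generations:", q)
```

Each migration step closes a fraction m of the remaining gap between the two populations, which is why the frequencies eventually become equal.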