Gene expression… from DNA to protein
Biology 1

• Genes control metabolism
• Gene expression is a two-stage process
  – Transcription
  – Translation
• Genes consist of triplets of nucleotides: the genetic code
• Protein synthesis in prokaryotes and eukaryotes
  – Eukaryotic modification of RNA
• Mutations

Genes control metabolism

• One gene, one polypeptide rule
• Polypeptides constructed by the transcription/translation process become either
  – structural proteins
  – enzymes
• Proteins that have quaternary structure may contain polypeptides originating from different genes

The transcription/translation process

• Transcription: DNA codes for the construction of mRNA
• Translation: mRNA is read at a ribosome (which contains rRNA); tRNA brings amino acids to the ribosome in the order defined by the code on the mRNA
• The ribosome assembles the polypeptide
• Recap on RNA: a ribonucleic acid (ribose sugar) that uses uracil (U) in place of thymine (T)

The genetic code

• The linear sequence of nucleotides in DNA ultimately determines the linear sequence of amino acids in a polypeptide
• There are approximately 20 types of amino acid to choose from
• In DNA, the four nucleotides are A, T, C and G
• Therefore, sequences built from four possible nucleotides must code for 20 amino acids
  – If DNA used an individual nucleotide to specify an individual amino acid, the system could code for only 4¹ = 4 amino acids
  – Using two nucleotides would account for 4² = 16 amino acids
  – Using three nucleotides would account for 4³ = 64 amino acids
• Since there are only 20 amino acids but 64 possible codes, some redundancy occurs
• Each block of three nucleotides, ultimately corresponding to a particular amino acid, is called a codon
• In the first stage of gene expression, transcription, the information in the codons of a gene is transferred to mRNA
• Transcription is carried out by an RNA polymerase that uses one of the DNA strands of the double helix (the template strand)
• For each amino acid, there are generally several possible codons. Also, some codons act as signals during translation: AUG (methionine) marks the start, and the stop codons (UAA, UAG, UGA) code for no amino acid and end the polypeptide at the ribosome

Transcription

• Three phases
  – Polymerase binding and initiation
  – Elongation
  – Termination
• In eukaryotes, RNA polymerase II binds to specific regions of DNA called promoters
• Promoters are typically about 100 nucleotides long and include
  – The initiation site, where transcription begins
  – Nucleotide sequences that help initiate transcription
• Initiation in eukaryotes requires transcription factors, DNA-binding proteins that bind to specific nucleotide sequences in the promoter region
  – A common binding site for a transcription factor is the TATA box
  – RNA polymerase recognizes the promoter once a transcription factor has bound to the DNA at the TATA box
• RNA polymerase temporarily separates the two strands of the double helix for transcription
• In elongation, RNA polymerase II (in eukaryotes)
  – Untwists the DNA molecule
  – Adds free RNA nucleotides to the 3′ end of the growing RNA strand (the strand grows 5′ to 3′)
• mRNA grows at 30-60 nucleotides/sec. The mRNA chain peels away from the template as the double helix re-forms
  – Following one another in series, several molecules of RNA polymerase can transcribe the same gene simultaneously
• Transcription proceeds until the polymerase reaches a termination sequence

Translation

• During translation, proteins are synthesized according to the genetic message carried as sequential codons along mRNA
• tRNA (transfer RNA) translates between the base sequence in mRNA and the amino acid sequence in a polypeptide chain.
  To do this, tRNA molecules must:
  – Transfer amino acids from the cytoplasm to the ribosome
  – Recognize the correct codons on mRNA
• Each tRNA molecule is specific to one particular amino acid
  – One end of the tRNA (the 3′ end) attaches to a specific amino acid
  – The other end attaches to an mRNA codon by base pairing with its anticodon
• An anticodon is a nucleotide triplet in tRNA
• tRNA decodes the genetic message codon by codon
• There are about 45 types of tRNA, which is sufficient to read the 61 codons that specify amino acids, since base pairing is relaxed at the third nucleotide of the codon (wobble)
  – e.g., U in the wobble position of the anticodon (the base that pairs with the third base of the codon) can bind with A or G in the codon
  – In some cases, the wobble position of a tRNA anticodon is occupied by inosine (I), a modified nucleotide, which can bind with U, C or A
• An aminoacyl-tRNA synthetase joins each tRNA to its specific amino acid at the 3′ end
• Each amino acid has its own synthetase enzyme
  – ATP activates the amino acid: it loses 2 phosphate groups and joins to the amino acid as AMP
  – The tRNA then bonds to the amino acid, displacing the AMP
• Ribosomes coordinate the pairing of tRNA anticodons with mRNA codons
  – They consist of 2 subunits (small and large) that remain separate when not involved in protein synthesis
  – Ribosomes are composed of about 60% rRNA and 40% protein
• In addition to an mRNA binding site, a ribosome has two further sites, the P-site and the A-site
  – The P-site holds the tRNA carrying the growing polypeptide chain
  – The A-site holds the tRNA carrying the next amino acid in the polypeptide sequence
• Building a polypeptide chain consists of three steps
  – Initiation
  – Elongation
  – Termination

Translation initiation

• In eukaryotes, the small ribosomal subunit binds to an initiator tRNA (carrying methionine; anticodon UAC)
• The small ribosomal subunit binds to the 5′ end of the mRNA, bringing the tRNA anticodon close to the mRNA methionine (start) codon, AUG
• This binding requires initiation factors
• Finally, the large subunit binds to the complex
  – The initiator tRNA fits into the P-site of the ribosome
  – The vacant A-site is ready for the next aminoacyl-tRNA complex

Translation elongation

• Codon recognition: the mRNA codon in the A-site of the ribosome forms hydrogen bonds with the anticodon of an entering tRNA carrying the next amino acid in the chain
• Peptide bond formation: the enzyme peptidyl transferase (part of the large ribosomal subunit) catalyzes formation of the peptide bond between the incoming amino acid and the growing polypeptide chain
• Translocation: the tRNA in the P-site is released from the ribosome, and the tRNA in the A-site, now carrying the polypeptide, moves into the vacated P-site

Translation termination

• A termination (stop) codon signals the end of translation; a protein release factor binds to it, which causes the following:
  – Peptidyl transferase hydrolyzes the bond between the completed polypeptide and the tRNA in the P-site
  – This frees the polypeptide and the tRNA, which are released from the ribosome
  – The two ribosomal subunits dissociate
  – The mRNA may continue to be translated by other ribosomes; several ribosomes translating one mRNA form a polyribosome

Differences between prokaryotic and eukaryotic gene expression

• The lack of a nuclear membrane in prokaryotes means that transcription can still be occurring at one end of an mRNA molecule while translation is under way at the other end
• In eukaryotes, RNA is modified after transcription and before translation
  – A 5′ cap is added (a modified guanine nucleotide)
  – A poly-A tail (about 200 adenine nucleotides) is added to the 3′ end
  – These ends may protect the mRNA sequence (they attach to the untranslated leader and trailer sequences, respectively)
• RNA splicing

RNA splicing

• Eukaryotic genes contain non-coding segments called introns; the coding segments are called exons
  – Introns and exons are initially transcribed into one long strand called hnRNA (heterogeneous nuclear RNA)
  – In RNA splicing, introns are removed from hnRNA to make mRNA
• Splicing involves snRNPs (“snurps”): small nuclear ribonucleoproteins, composed of snRNA (small nuclear RNA) and proteins
  – Together with additional proteins, snRNPs form complexes called spliceosomes, which excise introns (snRNPs attach to either end of each intron)
  – tRNA and rRNA also need to be spliced, but different agents do the splicing: ribozymes, RNA molecules that act as enzymes (note: thus not all enzymes are proteins)
• Why do introns exist?
  – They may regulate gene activity
  – Splicing may regulate the export of mRNA to the cytoplasm
  – Introns space exons further apart on the chromosome, which could mean a higher probability of recombination between exons during crossing over
  – Specific exons may code for specific domains within a protein, so introns often separate the coding regions for different domains

When things go wrong...

• Mutation = a permanent change in DNA that can involve large chromosomal regions or a single nucleotide pair
• Point mutation = a mutation limited to one or two nucleotides in a single gene
  – Base-pair substitution
    • Missense mutation
    • Nonsense mutation
  – Insertion/deletion mutations
• Base-pair substitutions often have no effect if they occur at the third nucleotide of a triplet, because of the redundancy of the genetic code
  – If they do change the amino acid (a missense mutation), a single amino acid substitution may not radically affect the function of the final polypeptide
  – In some cases function is improved; in most cases it is impaired
  – In nonsense mutations, the substitution causes a triplet to read STOP, abruptly terminating the polypeptide chain. Such mutations are usually harmful
• Insertions or deletions add or remove one or more nucleotides from a sequence
  – Since the reading frame is based on groups of three nucleotides, insertions and deletions that add or remove a number of nucleotides not divisible by 3 can substantially alter the final polypeptide
  – Such a mutation is called a frameshift. Frameshift mutations usually result in non-functional proteins, unless they occur towards the end of a sequence
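
Worked example: point mutations and the reading frame

The mutation types listed above can be traced on a toy message with a short Python sketch. This is a minimal illustration under stated assumptions, not a bioinformatics tool: the codon table is a deliberately partial subset of the standard genetic code, the mRNA sequence is invented, and the names `CODONS`, `MRNA`, `translate` and `classify_substitution` are hypothetical, chosen only for this example.

```python
# Partial codon table: a small subset of the standard genetic code,
# just large enough for the toy sequence below (an assumption of this sketch).
CODONS = {
    "AUG": "Met", "GCU": "Ala", "GCC": "Ala", "GAU": "Asp", "GAA": "Glu",
    "UGG": "Trp", "CUG": "Leu", "AUU": "Ile", "GGU": "Gly",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

MRNA = "AUGGCUGAUUGGUAA"  # hypothetical message: Met-Ala-Asp-Trp-STOP


def translate(mrna):
    """Read codons in frame from position 0 until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        # "???" marks a codon that is missing from the partial table
        residue = CODONS.get(mrna[i:i + 3], "???")
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


def classify_substitution(mrna, pos, new_base):
    """Label a single-base substitution as silent, missense, or nonsense.

    Only works for codons that are present in the partial table above.
    """
    mutant = mrna[:pos] + new_base + mrna[pos + 1:]
    start = pos - pos % 3                      # first base of the affected codon
    before = CODONS[mrna[start:start + 3]]
    after = CODONS[mutant[start:start + 3]]
    if after == before:
        return "silent"
    return "nonsense" if after == "STOP" else "missense"


print(translate(MRNA))                         # ['Met', 'Ala', 'Asp', 'Trp']
print(classify_substitution(MRNA, 5, "C"))     # GCU -> GCC: silent
print(classify_substitution(MRNA, 8, "A"))     # GAU -> GAA: missense
print(classify_substitution(MRNA, 11, "A"))    # UGG -> UGA: nonsense

frameshift = MRNA[:3] + MRNA[4:]               # delete 1 base: the frame shifts
in_frame = MRNA[:3] + MRNA[6:]                 # delete 3 bases: the frame is kept
print(translate(frameshift))                   # ['Met', 'Leu', 'Ile', 'Gly']
print(translate(in_frame))                     # ['Met', 'Asp', 'Trp']
```

Deleting one base changes every downstream codon and even removes the stop signal from the reading frame, whereas deleting three bases only drops one amino acid. All three substitutions shown fall on the third base of a codon, which illustrates why third-position changes are often, but not always, silent.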
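
Worked example: why codons are triplets

The counting argument from the genetic code slide (one, two or three nucleotides per amino acid) can be checked by enumerating every possible nucleotide word. The sketch below is only a check of that arithmetic; the names `BASES`, `count_codes`, `code_counts`, `min_length` and `triplets` are arbitrary choices for this example.

```python
from itertools import product

BASES = "ATCG"       # the four DNA nucleotides
AMINO_ACIDS = 20     # number of amino acids that must be encoded


def count_codes(length):
    """Count the distinct nucleotide words of a given length by enumeration."""
    return len(list(product(BASES, repeat=length)))


# Word length -> number of distinct codes: 4, 16 and 64
code_counts = {n: count_codes(n) for n in (1, 2, 3)}

# Smallest word length that provides at least 20 distinct codes
min_length = min(n for n, count in code_counts.items() if count >= AMINO_ACIDS)

# All possible triplets, e.g. "AAA", "AAT", ...
triplets = ["".join(p) for p in product(BASES, repeat=3)]

print(code_counts)   # {1: 4, 2: 16, 3: 64}
print(min_length)    # 3
```

Three is the shortest word length that covers 20 amino acids, and the 44 surplus codes are the source of the redundancy in the genetic code.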
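
Worked example: from template strand to polypeptide

The two stages of gene expression can also be followed on a toy sequence: transcription builds an mRNA complementary to the template strand, and translation pairs each codon with a tRNA anticodon until a stop codon is reached. This is a simplified sketch, assuming strict base pairing (wobble is ignored), an invented template strand, and a toy pool of only four tRNAs; `TEMPLATE`, `TRNA_POOL`, `transcribe`, `anticodon_for` and `translate` are hypothetical names.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}   # template base -> mRNA base
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}     # codon base -> anticodon base
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Toy tRNA pool keyed by anticodon (written 3'->5'); only four tRNAs are modeled.
TRNA_POOL = {"UAC": "Met", "CGA": "Ala", "CUA": "Asp", "ACC": "Trp"}

TEMPLATE = "CGTACCGACTAACCATT"   # hypothetical template strand, written 3'->5'


def transcribe(template_3to5):
    """Build the mRNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


def anticodon_for(codon):
    """Return the anticodon, written 3'->5' so it aligns base by base with the codon."""
    return "".join(RNA_PAIR[base] for base in codon)


def translate(mrna):
    """Start at the first AUG and add one amino acid per codon until a stop codon."""
    start = mrna.find("AUG")
    peptide = []
    if start == -1:
        return peptide                       # no start codon: nothing is translated
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:             # no tRNA matches; a release factor binds
            break
        # Raises KeyError if the toy pool has no tRNA for this codon
        peptide.append(TRNA_POOL[anticodon_for(codon)])
    return peptide


mrna = transcribe(TEMPLATE)
print(mrna)                   # GCAUGGCUGAUUGGUAA
print(anticodon_for("AUG"))   # UAC, the initiator tRNA anticodon
print(translate(mrna))        # ['Met', 'Ala', 'Asp', 'Trp']
```

The two bases before AUG stand in for the untranslated leader; the stop codon UAA has no entry in the tRNA pool, mirroring the fact that termination relies on a release factor rather than a tRNA.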