Basic Concepts of Molecular Biology

From a bioinformatics point of view. A reminder for those trained in the exact sciences: nothing is 100% true.

Contents:
1. Life
2. Proteins
3. Nucleic Acids
4. Molecular Genetics (mechanisms)
5. How the Genome is studied
6. Sequence Databases

1. Life

Living organism: through a complex array of chemical reactions, it constantly exchanges matter and energy with its surroundings and never comes to a standstill (no deadlock).

The main actors in the chemistry of life (biochemistry) are molecules called proteins ("we are our proteins") and nucleic acids.

Molecular biology research is basically devoted to understanding the structure and functions of proteins and nucleic acids.

2. Proteins

Most substances in our bodies are proteins:
- structural proteins act as tissue building blocks
- enzymes act as catalysts of chemical reactions

A protein is a chain of amino acids. Each amino acid has a central carbon (the α carbon, Cα) to which are attached:
- a hydrogen atom H
- an amino group NH2
- a carboxy group COOH
- a side chain particular to each amino acid

Proteins (cont. 1)

In a protein, amino acids are joined by peptide bonds: the C belonging to the COOH group of amino acid A_i bonds to the N of the NH2 group of amino acid A_(i+1), and a water molecule is liberated.

What we really find inside a polypeptide chain is a residue of the original amino acid. Thus, we generally speak of proteins having a given number of residues (typically from 300 to 5000).

The repetition of -N-Cα-CO- blocks is the backbone.

The convention is that a polypeptide chain begins at the amino group (N-terminal) and ends at the carboxy group (C-terminal).

Proteins (cont. 2)

Structures:
- The primary structure is the sequence of the residues.
- The secondary structure takes into account local interactions between atoms of the backbone (α-helix, β-sheet, loops).
- The tertiary structure expresses the folding in 3D.
- The quaternary structure is a group of different proteins (polypeptide chains) packed together.

Proteins (cont. 3)

Predicting how a protein folds is one of the main research areas in molecular biology (the "Holy Grail"). The values of all pairs of angles φ (rotation about the bond between the N atom and the Cα atom) and ψ (rotation about the bond between the Cα atom and the other backbone C atom) for the successive residues would give the exact backbone structure. This is a very difficult problem.

The three-dimensional form of a protein is related to its function. A folded protein has varied nooks and bulges that let it bind to other molecules, to build groups or to exchange atoms.

Proteins are produced in a cell structure called the ribosome, where the amino acids are assembled one by one according to the information carried by an important molecule called messenger ribonucleic acid (mRNA).
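To make the role of φ and ψ concrete, here is a minimal sketch of how a torsion (dihedral) angle can be computed from four atom positions. It assumes the usual convention, which the slides do not spell out, that φ is defined by the atoms C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The function name `dihedral` and the coordinates are illustrative only, not taken from any real structure.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the first and last points lie on the same
# side of the central bond (cis) or at a right angle to each other.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
right_angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

With real data, φ for residue i would be `dihedral(C_prev, N, CA, C)` and ψ would be `dihedral(N, CA, C, N_next)`, each argument being an (x, y, z) tuple.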

3. Nucleic Acids

Living organisms contain two kinds of nucleic acids:
- ribonucleic acid (RNA)
- deoxyribonucleic acid (DNA)

DNA

- double chain with two strands.
- the backbone is formed by a sugar molecule (2'-deoxyribose) attached to a phosphate residue.
- orientation: the carbon atoms of the sugar are labeled 1' to 5'. The basic bond of the backbone is: 3' carbon - phosphate residue - 5' carbon. By convention, a strand begins at the 5' end and finishes at the 3' end.

DNA (cont. 1)

- To each 1' carbon is attached a base: adenine A and guanine G (the purines), cytosine C and thymine T (the pyrimidines).
- A nucleotide is the unit sugar + phosphate + base.
- An oligonucleotide is a DNA molecule having only a few (tens of) nucleotides.

DNA (cont. 2)

- DNA molecules are double strands which are tied together in a helix structure (James Watson and Francis Crick, 1953).
- A (resp. C) is the complement of T (resp. G). Unit of length: bp (base pair).
- The two strands are antiparallel. One can be deduced from the other by reverse complementation.

Example: s = AGACGT, s' = TGCAGA (reverse), s̄ = ACGTCT (reverse complement)
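Reverse complementation is easy to express in code. The following is a minimal Python sketch; the function names are illustrative, and the input is assumed to contain only the uppercase letters A, C, G and T.

```python
# Translation table: each base is mapped to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse(s):
    """Return the sequence read backwards."""
    return s[::-1]

def reverse_complement(s):
    """Return the opposite strand, written 5' to 3'."""
    return s.translate(COMPLEMENT)[::-1]

s = "AGACGT"
print(reverse(s))             # TGCAGA
print(reverse_complement(s))  # ACGTCT
```

Applying `reverse_complement` twice gives back the original strand, which is a convenient sanity check.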

RNA

Differences with DNA:
- The sugar is ribose instead of 2'-deoxyribose.
- Instead of T, one finds U (uracil), which also binds with adenine.
- RNA usually does not form a double helix. It may have a far more varied three-dimensional structure.
- There are different kinds of RNA, which perform different functions.

4. The Mechanisms of Molecular Genetics

Genes and the genetic code

- Chromosome: a long DNA molecule containing coding regions, the genes, which code for proteins.
- Each amino acid is specified by a codon, a triplet of nucleotides. The correspondence between each triplet (using RNA) and each amino acid is given by the genetic code:
- There are 64 possible triplets, but only 20 amino acids.
- Several codons can code for one amino acid (e.g. AAG and AAA for lysine).
- Three STOP codons (UAA, UAG, UGA) are used to signal the end of a gene.
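The counting argument above (64 triplets for 20 amino acids) can be checked with a small Python sketch. It builds the standard genetic code from a compact 64-letter string, using one-letter amino acid codes and `*` for STOP. The variable names are illustrative, and the standard code is assumed; some organisms and organelles use slight variants.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order
# UUU, UUC, UUA, UUG, UCU, ... ('*' marks a STOP codon).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons coding for it.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

print(len(CODON_TABLE))        # 64 possible triplets
print(sorted(codons_for["K"])) # codons for lysine: ['AAA', 'AAG']
print(sorted(codons_for["*"])) # STOP codons: ['UAA', 'UAG', 'UGA']
```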

Transcription

Produces RNA from DNA by means of the RNA polymerase: mRNA (messenger RNA) from a gene, or rRNA (ribosomal RNA), or tRNA (transfer RNA).
- The RNA polymerase recognizes the beginning of a gene (or of a gene cluster) thanks to a promoter (the TATA box is the best known of them), which is situated upstream (before the START codon AUG). Termination is less well understood (polyadenylation).
- The template strand is the one that is transcribed (mRNA is composed by binding together ribonucleotides complementary to this strand).
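As a rough illustration of the template-strand rule, the sketch below builds the mRNA that is complementary to a template strand. It is a toy model that ignores promoters and termination; the function name `transcribe` and the example sequence are made up, and the template is assumed to be written 5' to 3'.

```python
# DNA base on the template -> complementary RNA base.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")

def transcribe(template_strand):
    """Return the mRNA (5' to 3') complementary to a template
    strand that is itself written 5' to 3'."""
    # The strands are antiparallel, so the complement is reversed.
    return template_strand.translate(DNA_TO_RNA_COMPLEMENT)[::-1]

coding_strand = "ATGGCC"    # toy example, hypothetical gene fragment
template_strand = "GGCCAT"  # its reverse complement
mrna = transcribe(template_strand)
print(mrna)                 # AUGGCC
```

The result is the coding strand with every T replaced by U, which is why sequence databases usually list the coding strand rather than the template.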

Transcription (cont.1)

- For eukaryotes (organisms whose cells have a nucleus), the mechanism is more complex than for prokaryotes (cells without a nucleus, like bacteria). Genes can contain alternating parts, called exons and introns (the introns are transcribed but removed before translation). Splicing (which removes the introns from the primary transcript) is done in the nucleus and delivers the mRNA outside the nucleus. Alternative splicing (the same DNA can give rise to two or more different mRNAs by choosing introns and exons in a different way) may also occur.
- One distinguishes genomic DNA (the gene as found in the chromosome) from complementary DNA (cDNA, a sequence consisting of exons only). cDNA can be produced from RNA by reverse transcription (EST, Expressed Sequence Tag).
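Splicing can be pictured as keeping the exon intervals of the primary transcript and joining them. The sketch below is a toy model under stated assumptions: exon positions are given as 0-based, half-open (start, end) pairs, and the sequence, coordinates and function name `splice` are all hypothetical. Introns are written in lowercase only for readability.

```python
def splice(primary_transcript, exons):
    """Join the exon intervals (0-based, end exclusive) in order."""
    return "".join(primary_transcript[start:end] for start, end in exons)

# Toy primary transcript: exon1 + intron + exon2 + intron + exon3.
pre_mrna = "AUGGCC" + "guaag" + "GGU" + "guuag" + "AAAUGA"
exons = [(0, 6), (11, 14), (19, 25)]

# Constitutive splicing keeps every exon.
full = splice(pre_mrna, exons)                   # AUGGCCGGUAAAUGA

# Alternative splicing: the middle exon is skipped.
skipped = splice(pre_mrna, [exons[0], exons[2]]) # AUGGCCAAAUGA
```

Choosing a different subset of intervals from the same transcript is all that distinguishes the two mRNAs here, which mirrors the idea of alternative splicing.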

Translation

Produces a protein from an mRNA by using a ribosome, which makes use of tRNAs to add the amino acid corresponding to each codon.
- Initiated when one of the rRNAs of the ribosome binds to a particular sequence (more or less the Shine-Dalgarno sequence ATTCCTCCA) in the RNA.
- The first codon to be translated is AUG.
- There are not as many tRNAs as there are codons. Their number varies among species (40 for the bacterium E. coli).
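The codon-by-codon reading can be sketched in Python as follows. This is a simplified model, not a description of the ribosome: it assumes the standard genetic code, starts at the first AUG found in the sequence, and stops at the first in-frame STOP codon. The function name `translate` and the example mRNA are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG up to the first STOP codon.

    Returns the protein as a string of one-letter amino acid codes,
    or an empty string if the sequence contains no AUG.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # STOP codon: release the chain
            break
        protein.append(aa)
    return "".join(protein)

print(translate("GGAUGAAAGCCUAAGG"))  # MKA
```

In the example, the codons read are AUG (methionine), AAA (lysine), GCC (alanine) and UAA (STOP).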

Junk DNA and Reading Frame

Junk DNA: intergenic regions between coding parts. Prokaryotes have little of it, eukaryotes much more (more than 90% of the human genome).

Reading frame: one of the three possible ways of grouping the bases of a sequence into codons.

Example: TAATCGAATGGC has the three following frames: [TAA, TCG, AAT, ...], [AAT, CGA, ...], [ATC, GAA, ...]

- Six reading frames have to be considered if one wants to translate a DNA sequence into a (supposed) protein, because of the two strands.

Open Reading Frame (ORF) in a DNA sequence: a subsequence beginning at a start codon and having an integral number of codons, none of which is a STOP codon.
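The frame and ORF definitions above translate directly into code. The following Python sketch is one possible reading of them: it assumes ATG as the only start codon, takes each ORF as long as possible (up to the next in-frame STOP codon or the end of the sequence), and uses illustrative function names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def codons(seq, offset):
    """Group seq into complete triplets, starting at offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def six_frames(seq):
    """Three frames of seq, then three frames of its reverse complement."""
    rc = seq.translate(COMPLEMENT)[::-1]
    return [codons(s, k) for s in (seq, rc) for k in range(3)]

def orfs_in_frame(seq, offset):
    """Maximal ORFs in one frame: start codon up to (not including) a STOP."""
    found, current = [], None
    for codon in codons(seq, offset):
        if current is None:
            if codon == START:
                current = [codon]
        elif codon in STOPS:
            found.append("".join(current))
            current = None
        else:
            current.append(codon)
    if current is not None:  # ORF still open at the end of the sequence
        found.append("".join(current))
    return found

frames = six_frames("TAATCGAATGGC")
print(frames[0])  # ['TAA', 'TCG', 'AAT', 'GGC']
print(frames[1])  # ['AAT', 'CGA', 'ATG']
print(frames[2])  # ['ATC', 'GAA', 'TGG']
print(orfs_in_frame("CCATGAAATAGATGCC", 2))  # ['ATGAAA', 'ATG']
```

Real gene finders add further criteria, such as a minimum ORF length, which are outside the definition given here.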

Chromosomes

Genome: the complete set of chromosomes of an organism.
- Prokaryotes usually have one chromosome (sometimes circular).
- In eukaryotes, chromosomes appear in pairs (23 pairs for humans; cells containing both members of each pair are called diploid). The two chromosomes of a pair are said to be homologous. Different versions of a gene found on the two chromosomes of a pair are called alleles.
- Cells which carry only one member of each pair are haploid (those used in sexual reproduction); they are formed through the process of meiosis.
- Not all genes are expressed by a specific cell.

Genome size of "important" species

Species                               Chromosomes   Genome size (bp)
Bacteriophage λ (virus)               1             5*10^4
Escherichia coli                      1             5*10^6
Saccharomyces cerevisiae (yeast)      32            1*10^7
Caenorhabditis elegans (worm)         12            5*10^8
Drosophila melanogaster (fruit fly)   8             2*10^8
Homo sapiens (human)                  46            5*10^9
