Module 1
Introduction to Bioinformatics
Sevas Educational Society All Rights Reserved, 2008

What is Life made of?

Life begins with the Cell
• A cell is the smallest structural unit of an organism that is capable of independent functioning.
• All cells have some common features.

All Life depends on 3 critical molecules
• DNA
• RNA
• Proteins

DNA: The Code of Life
• DNA is built from four bases: Adenine (A), Guanine (G), Thymine (T), and Cytosine (C). They pair A-T and C-G across the two strands.
• DNA and RNA are called nucleic acids.
See the next slide for more about DNA.

DNA: The Basis of Life
• Deoxyribonucleic Acid (DNA)
  – Double stranded, with the two strands paired A-T and C-G
• DNA is a polymer
  – Each unit is Sugar-Phosphate-Base
  – Bases are held together by hydrogen bonding to the opposite strand

DNA, continued
• DNA has a double helix structure. Each of its units (nucleotides) is composed of
  – a sugar molecule
  – a phosphate group
  – a base (A, C, G, or T)
A - Adenine
C - Cytosine
G - Guanine
T - Thymine

DNA: The Basis of Life
DNA is present in chromosomes. The figure shows how DNA is packed into a chromosome.

Central Dogma of Biology
The information for making proteins is stored in DNA. Through two processes, transcription and translation, the information in DNA is used to make protein.
DNA
  Transcription (the formation of RNA from DNA)
RNA
  Translation (the formation of protein from RNA)
Protein

RNA
• RNA is chemically similar to DNA, but it is usually only a single strand. DNA and RNA are similar in structure, with one main difference in their bases.
DNA consists of adenine (A), guanine (G), thymine (T), and cytosine (C), while RNA consists of adenine (A), guanine (G), uracil (U), and cytosine (C). The difference in bases is that thymine (T) is replaced by uracil (U); in addition, RNA is usually single stranded while DNA is double stranded.
• Several types of RNA exist, such as mRNA, tRNA, and rRNA, and each has its own significance.

Definition of a Gene
A gene is a region of DNA that is responsible for the formation of RNA and protein. A gene consists of two kinds of parts:
1. Exons: the segments that are kept in the mRNA
2. Introns: the segments that are removed during splicing

Formation of RNA from a gene (a gene is always present in DNA)
The process of removing all introns to form mRNA is called "splicing". The region of the mRNA that is responsible for protein formation is called the "Open Reading Frame".

Protein
Complex organic molecules made up of amino acid subunits. Proteins are formed from 20 different kinds of amino acids, each of which has a 1-letter and a 3-letter abbreviation. A protein is also called a "polypeptide".

A chain of amino acids joined together is called a peptide; long chains are called polypeptides or proteins.

RNA → Protein: Translation
mRNA (messenger RNA) passes through the ribosome (a small organelle in the cell). Three consecutive nucleotides in the mRNA specify one amino acid, and each such group of three is called a codon. In the figure you can see that codons such as UUC and GGA code for amino acids. A codon table is provided on the next slide; it shows which codon codes for which amino acid.

RNA → Protein: Translation
Protein is formed from RNA: each group of three nucleotides in the RNA, called a codon, is responsible for one amino acid in the protein. The more codons are translated, the longer the resulting protein.
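
The two steps described so far (T is replaced by U, then the RNA is read three nucleotides at a time) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides. The function names are made up for this example, the input DNA string is invented, and the codon table is deliberately partial: it holds only the codons named in these slides, with their meanings taken from the standard genetic code. Treating transcription as a simple T-to-U replacement on the coding strand is a simplifying assumption.

```python
# Partial codon table: only the codons named in these slides
# (meanings from the standard genetic code). A real table has 64 entries.
CODON_TABLE = {
    "UUU": "F",  # Phenylalanine
    "UUC": "F",  # Phenylalanine
    "UUA": "L",  # Leucine
    "GGA": "G",  # Glycine
    "CGA": "R",  # Arginine
    "AAG": "K",  # Lysine
    "UGG": "W",  # Tryptophan
    "UAG": "*",  # Stop codon (no amino acid)
}


def transcribe(dna: str) -> str:
    """Simplified transcription: copy the coding strand, replacing T with U."""
    return dna.upper().replace("T", "U")


def split_codons(rna: str) -> list[str]:
    """Split an RNA string into consecutive groups of three nucleotides."""
    usable = len(rna) - len(rna) % 3  # ignore an incomplete trailing group
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def translate(rna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon.

    Codons missing from the partial table are written as '?'.
    """
    protein = []
    for codon in split_codons(rna):
        amino_acid = CODON_TABLE.get(codon, "?")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("AAGTTCGGATGGTAGCGA")  # invented example sequence
print(rna)                # AAGUUCGGAUGGUAGCGA
print(split_codons(rna))  # ['AAG', 'UUC', 'GGA', 'UGG', 'UAG', 'CGA']
print(translate(rna))     # KFGW
```

Translation stops at UAG, so the final codon CGA is never read. To translate arbitrary sequences, the table would need all 64 codons from the full codon table.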
RNA codon    Amino acid
UUA          Leucine
CGA          Arginine
UAG          Stop codon (no amino acid)
UGG          Tryptophan

For more explanation (how protein is formed from mRNA)
tRNA enters the ribosome, where the mRNA is being read, carrying one amino acid (in this case, valine). As more amino acids are joined together, the protein is formed. (In the figure you can observe a protein consisting of eight amino acids.)

Proteins are represented by single-letter codes (use the table on the 14th slide).
Write down the names of the corresponding amino acids:
AGASPFMKLKKAGAKAHLKMSHFWYVHSIL
Example: A - Alanine, G - Glycine

DNA consists of four nucleotides: adenine, guanine, cytosine, and thymine. They are represented by single letters.
Adenine   - A
Guanine   - G
Cytosine  - C
Thymine   - T
Write down the names of the nucleotides present in the DNA "AAAGAGACGTACGACGAGCGC"
Like: Adenine-Adenine-Adenine-Guanine-...

Three nucleotides code for a single amino acid; for example, the codon "UUU" produces the amino acid phenylalanine. Using the above table, find the protein sequence of the following RNA, starting from the first "A" only.
AAGAGARGACGUGCGACGACGUCGAGUCAAAAACGUCGA
Clue: AAG - look up the corresponding amino acid in the above table; it is "Lys" or "K". Likewise, take three nucleotides at a time and find the amino acids.

Bioinformatics
Bioinformatics is generally defined as the analysis, prediction, and modeling of biological data with the help of computers.
What is biological data, and how do we put a protein into a computer?
Here, biological data means protein and nucleic acid sequences represented as strings of letters.
Example:
Protein sequence (written with an alphabet of twenty amino acids)
AAGHWTILKWGRSH
DNA sequence (written with an alphabet of four nucleotides)
AAGAGTCGCGAGAGGACG

What is Computational Biology?
This branch of biology uses techniques from applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry to solve biological problems, usually at the molecular level.

Bioinformatics is multidisciplinary
Bioinformatics draws on:
• Mathematics / computer science
• Genomics
• Molecular biology
• Biophysics
• Biomedicine
• Ethical, legal, and social implications
• Molecular evolution

From chromosomes to sequence data
Large-scale DNA sequencing
CGCCAGCTGGACGGGCACACCATGAGGCTGCTGACCCTCCTGGGCCTTCTG
TGTGGCTCGGTGGCCACCCCCTTAGGCCCGAAGTGGCCTGAACCTGTGTTC
GGGCGCCTGGCATCCCCCGGCTTTCCAGGGGAGTATGCCAATGACCAGGAG
CGGCGCTGGACCCTGACTGCACCCCCCGGCTACCGCCTGCGCCTCTACTTC
ACCCACTTCGACCTGGAGCTCTCCCACCTCTGCGAGTACGACTTCGTCAAG

Some Terminology
Genome: In biology, the genome of an organism is its whole hereditary information, encoded in DNA (or, for some viruses, RNA). Genome size is expressed in base pairs (A-T or C-G) of nucleic acid. The human genome consists of about 3 billion base pairs.
Genomics is the study of an organism's entire genome. The field includes intensive efforts to determine the entire DNA sequence of organisms.

Genome sequencing and analysis (genomics)
• Genomics generates a vast amount of DNA sequence data.
• Sophisticated algorithms are used to predict gene regions.
• Only ~3% of the vertebrate genome codes for proteins.
• GenBank holds sequences from over 800 organisms.
• There are currently 113 complete genomes.
• The completion of a "working draft" of the human genome was announced in June 2000.
• Estimates range from 38,000 to 120,000 genes (around 40,000).

Genome size in base pairs

ORGANISM                               CHROMOSOMES (haploid set)   GENOME SIZE (bp)   GENES
Homo sapiens (Human)                   23                          3,200,000,000      ~30,000
Mus musculus (Mouse)                   20                          2,600,000,000      ~30,000
Drosophila melanogaster (Fruit fly)    4                           180,000,000        ~18,000
Saccharomyces cerevisiae (Yeast)       16                          14,000,000         ~6,000
Zea mays (Corn)                        10                          2,400,000,000      ???

1. Draw the figure shown on the sixth slide.
2. Draw the table of amino acids (14th slide).
3. Search for "what is tRNA" on google.com.
4. Literature collection:
   – Open www.google.com in your web browser.
   – Search for "NCBI home page" and go to the NCBI home page.
   – Click on PubMed, then type any keyword you are interested in into the search box, e.g., lung cancer, human genome project, polymerase, immunoglobulins.
   – The window will display your results.
   – Click on a result to read its abstract or full text.
5. Go to www.answers.com, type "transcription", and read the explanation.
6. Go to google.com, type "ppt introduction to bioinformatics", and collect more than five PowerPoint presentations. (If you add "ppt" to any keyword, you will get PowerPoint presentations.)

References:
1. http://en.wikipedia.org/wiki/Bioinformatics
2. PowerPoint slides from the internet: http://www.bioalgorithms.info/presentations.old/Ch03_Molecular_Biology_Primer.ppt