WHAT IS BIOINFORMATICS?
Daniel Svozil, Laboratory of Informatics and Chemistry
svozild@vscht.cz
http://ich.vscht.cz/~svozil

Canceled lecture
• Wed, 9. 3. 2016, lecture is canceled

Study materials
• http://ich.vscht.cz/~svozil/teaching.html

Coursera
• MOOC
• www.coursera.org
• Bioinformatic Methods I, II
• Bioinformatics: Life Sciences on Your Computer
• Bioinformatics Algorithms 1, 2
• Algorithms, Biology, and Programming for Beginners
• Computational Molecular Evolution

edX
• www.edx.org
• Data Analysis for Genomics

studuj.bioinformatiku.cz

Definition
• NCBI
  • Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights and to create a global perspective from which unifying principles in biology can be discerned.
• Wikipedia.org
  • The application of information technology and statistics to the field of molecular biology.
  • The creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management, analysis and interpretation of biological data.
http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html

Extraction of biological knowledge from data
• Data: experimental, from public databases
• Data → Knowledge: convert data to knowledge
• Knowledge → Data: generate new hypotheses, design new experiments

Omes
• genome – DNA sequence in an organism
• transcriptome – mRNA of an entire organism
• proteome – all proteins in an organism
• metabolome – all metabolites in an organism
• interactome – all molecular interactions in an organism
[Figure labels: Organism, Cell, Tissue architectures; Genome, Transcriptome, Proteome, Metabolome, Reactome; Cell interactions, Signaling, …]

Omes and Omics
• Genomics
  • Primarily sequences (DNA and RNA)
  • Databanks and search algorithms
  • Supports studies of molecular evolution
• Proteomics
  • Sequences (protein) and structures
  • Mass spectrometry, X-ray crystallography
  • Databanks, knowledge bases, visualization
• Functional Genomics (transcriptomics)
  • Microarray data
  • Databanks, analysis tools, controlled terminologies
• Systems Biology (metabolomics)
  • Metabolites and interacting systems (interactomics)
  • Graphs, visualization, modeling, networks of entities

“Omics”
• includes: Genomics, Transcriptomics, Proteomics, Metabolomics, Interactomics, …
• measured by: Sequencing, Microarrays, LC/MS, NMR, Two hybrid, …
• these data are: high-throughput, high-noise
• to reduce noise: advanced pre-processing techniques → reliable high-throughput information
• techniques to analyze high-dimensional data and knowledge bases → biological knowledge, medical knowledge → improved health
source: Bios 560R Introduction to Bioinformatics, userwww.service.emory.edu/~tyu8/560R/560R_1.pptx

Key research in bioinformatics
• sequence bioinformatics
• structural bioinformatics
• systems biology
  • analysis of biological pathways, e.g. to understand disease processes

21st century – complex systems
• Designing (forward-engineering)
• Understanding (reverse-engineering)
• Fixing
• Why is it so complex?
• Can we make sense of this complexity?
• How is it robust?
http://yilab.bio.uci.edu/ICSB2007_Tutorial_AM1.htm

CELL BIOLOGY
Daniel Svozil

Molecular biology
• Though all aspects of biology can be studied at the molecular level, molecular biology is usually restricted to the molecules of genes, gene products and heredity (molecular genetics)
• Experiments in molecular biology are done using model organisms
• Two classes of organisms
  • Prokaryotes
  • Eukaryotes

Prokaryotes vs. Eukaryotes
• plasma membrane
• nucleus
• organelles

Bacteria
• 1 bacterium = 1 cell
• lower organisms
• Escherichia coli (E. coli)

Cells in eukaryotes
• body (somatic) cells
  • differentiated into special cell types (brain cells, liver cells …)
  • reproduce by simple cell division – mitosis
• sex cells (gametes)
  • egg, sperm
  • used for sexual reproduction (only eukaryotes)
  • meiosis – reduction of the amount of genetic material

Eukaryotic chromosomes
• Threadlike DNA, carries genes
• Each organism has a specific number of chromosomes
• Sex chromosomes (determine sex: XX female, XY male) and autosomal chromosomes
• 46 in human: 2 sex, 44 autosomal
• Come in pairs (diploid); the two chromosomes in a pair (homologs) have the same shape and the same set of genes, but possibly different alleles

Cell cycle
• Division of the cell into two exact copies.
[Figure labels: homologous chromosomes; homologous chromosomes copied. Source: Genetics for Dummies, Tara Robinson]
http://www.bothbrainsandbeauty.com/wp-content/uploads/2009/11/chromosomes.jpg

Karyotype
[Figure. Source: Genetics for Dummies, Tara Robinson]

Mitosis
• diploid (2n) mother cell
• DNA synthesis: 2n → 4n
• division: 4n → 2n + 2n
• two identical diploid (2n) daughter cells

Sexual reproduction
• Egg gets fertilized by sperm. A zygote is created.
• The zygote is diploid (it divides by mitosis), so the gametes must be haploid!
• In an organism with diploid cells, how do you get haploid cells?
• Meiosis (another type of cell division)

Meiosis
• The result of meiosis is a haploid cell.
• From one diploid parent cell you get four haploid cells. In addition, homologous chromosomes go through recombination.
http://www.britannica.com

DNA – The Basis of Life

DNA
• Biomacromolecule
• Consists of repeating units
• DNA in an organism does not usually exist in one piece
  • chromosomes

Deconstructing DNA
• http://www.umass.edu/molvis/tutorials/dna/
• bases, deoxyribose sugar, phosphate – nucleotide
• Bases are flat → stacking
• pYrimidines – C, T
• puRines – A, G

Nucleoside
[Figure labels: base, O5', C5', sugar, C3', O3']

Nucleotide
• nucleosides are interconnected by phosphodiester bonds
• nucleotide = nucleoside monophosphate

Bases complement each other.

Chargaff's rules
• amount of G = C
• amount of A = T

DNA conformations
[Figure: A-DNA, B-DNA, Z-DNA]

Biological role of different DNAs
• B-DNA
  • canonical DNA
  • predominant
• A-DNA
  • Forms under conditions of lower humidity, common in crystallographic experiments. However, these conditions are artificial.
  • In vivo – local conformations induced e.g. by interaction with proteins.
• Z-DNA
  • No definite biological significance found up to now.
  • It is commonly believed to provide torsional strain relief (supercoiling) while DNA transcription occurs.
  • The potential to form a Z-DNA structure also correlates with regions of active transcription.

Different sets of DNA
• nuclear DNA
  • located in the cell's nucleus
  • responsible for the majority of functions the cell carries out
  • "sequencing the genome" – scientists mean nuclear DNA
• mitochondrial DNA (mtDNA)
  • circular, in human very short (about 16.6 kbp) with 37 genes (controlling cellular metabolism)
  • all mtDNA comes from the mother, no recombination (Mitochondrial Eve)
• chloroplast DNA (cpDNA)
  • circular and fairly large (120–160 kbp), with only about 120 genes
  • inheritance is either maternal or paternal

Structure of DNA in the eukaryotic cell
• DNA in human chromosomes: 3.2 × 10^9 bp
• As we're diploid: 6.4 × 10^9 bp
• 0.33 nm per bp
• 2.1 m in each nucleus
• size of the nucleus: 5–10 µm across
• DNA is highly compacted.
• Combination DNA + proteins.
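The length estimate on the slide above can be reproduced with a few lines of Python. This is only a minimal sketch of the arithmetic: the constants are the slide's own figures (3.2 × 10^9 bp per haploid genome, 0.33 nm per bp, a nucleus 5 to 10 µm across), while the names `HAPLOID_GENOME_BP`, `RISE_PER_BP_NM`, `NUCLEUS_DIAMETER_UM` and `dna_length_m` are hypothetical and chosen for illustration.

```python
# Figures taken from the slide; all names here are illustrative assumptions.
HAPLOID_GENOME_BP = 3.2e9       # base pairs in one set of human chromosomes
RISE_PER_BP_NM = 0.33           # length contributed by one base pair, in nm
NUCLEUS_DIAMETER_UM = (5, 10)   # range of nucleus diameters, in micrometres


def dna_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Return the stretched-out length (in metres) of DNA with `base_pairs` bp."""
    return base_pairs * rise_nm * 1e-9  # nm -> m


# A diploid cell carries two copies of the genome.
diploid_bp = 2 * HAPLOID_GENOME_BP
length_m = dna_length_m(diploid_bp)
print(f"diploid genome: {diploid_bp:.1e} bp, about {length_m:.1f} m of DNA")

# Compare the DNA length with the diameter of the nucleus it has to fit into.
for diameter_um in NUCLEUS_DIAMETER_UM:
    ratio = length_m / (diameter_um * 1e-6)  # um -> m
    print(f"nucleus {diameter_um} um across: DNA is roughly {ratio:,.0f} times longer")
```

The ratio of several hundred thousand between DNA length and nuclear diameter is what motivates the compaction of DNA with proteins.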
During interphase, when cells are not dividing, the genetic material exists as a nucleoprotein complex called chromatin, which is dispersed through much of the nucleus. Further folding and compaction of chromatin during mitosis produces the visible metaphase chromosomes.
• euchromatin – extended
• heterochromatin – condensed

Chromatin
[Figure label: nucleosome]

Nucleosome
[Figure]

Central dogma of molecular biology
[Figure. Sources: Wikipedia; Molecular Cell Biology, Harvey Lodish]
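The central dogma describes the flow of information from DNA to RNA to protein. A minimal Python sketch of that flow follows. It is an illustration, not a full implementation: the function names `transcribe` and `translate` and the example sequence are hypothetical, the input is assumed to be the coding strand, and `CODON_TABLE` holds only the four codons of the standard genetic code that the example needs.

```python
# Partial codon table (assumption: standard genetic code, only the codons used below).
CODON_TABLE = {
    "AUG": "M",  # methionine, the usual start codon
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """DNA -> mRNA: on the coding strand this amounts to replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein: read codons in steps of three until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons not in the table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTTGGTAA"          # hypothetical coding-strand sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(dna, "->", mrna, "->", protein)
```

A real analysis would use a complete codon table, for instance from an established library, rather than this four-entry dictionary.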