Overview of Bioinformatics 1 Module
Denis Manley

Contact Details
• Lecturer name: Denis Manley
• Room number: KE-1-013a
• Email: denis.manley@dit.ie
• Website: www.comp.dit.ie/dmanley
• Phone: 01 402 4949

What is bioinformatics
• Bioinformatics is the use of computers and computational methods to analyse large sets of molecular biological data. It is used for:
  – The investigation of “living organisms” and their evolution.
  – The discovery of genes, gene regulation, genetic networks and protein functionality, which can be used to understand human disease, human development (conception to adulthood), etc.
  – The results can facilitate our understanding of diseases like cystic fibrosis, suggest therapies, and support the development of cures such as drug development, viral therapy…

Reading DNA novels: “bioinformatics”
• Analysing large sets of data is “equivalent” to reading and understanding a book (computational linguistics).
• The syntax (letters and words):
  – Reading involves looking at letters (including spaces and punctuation) to determine the words. Bioinformatics applied to DNA reads four letters (A, T, G and C) and can help determine the location of genes or other important elements of DNA (which correspond to words).
  – The next step in reading involves determining whether the words are nouns, verbs, adverbs, etc. In general there are rules: “what are they?”
  – Bioinformatics involves determining what the “important elements” correspond to, e.g. genes, gene promoters…
  – However, the rules for identifying “genes” and other elements clearly differ from those of a natural language and, more importantly, are sometimes modified (when something new is discovered).
• The syntax (word order):
  – The next step is determining the sequence of the words, e.g. should it be “what are the rules of English grammar” or “are what the rules of grammar English”?
  – Bioinformatics involves determining the sequence of “important elements”, e.g. promoters are “upstream” of genes and not the other way around.
• The semantics:
  – What does the set of words (sentence) mean? For example, “what is your purpose?”: what processes do humans use to interpret this sentence?
  – Bioinformatics attempts to analyse the function of DNA/genetic sequences by, e.g.:
    1. Comparing the sequences to sequences whose function is already known.
    2. Converting the sequence into its equivalent “protein” and comparing it to known proteins.
    3. Determining the 3-D structure of proteins and looking for known structural components.
• Bioinformatics also focuses on the computational aspects of the discipline, such as:
  – Setting up databases
  – Writing code to perform analysis
  – Identifying and utilising known computational techniques to improve analysis of the biological data.
• Bioinformatics covers a very large area, but this particular module will focus on the “computational analysis of genetic systems” and will be referred to as Bioinformatics 1.

Bioinformatics 1: module syllabus
• Part 1: Fundamentals of genetic systems:
  • Principles of inheritance and evolution: essential criteria for our evolution and existence.
  • Basic molecular cell biology: DNA, genes and amino acids (proteins).
  • The relationship between a gene and its physical manifestation (proteins); the central “dogma” of genetics: DNA -> RNA -> Proteins
  • Introduction to structural elements of genetic systems
  • Examples of gene “expression” regulation
• Part 2: Programming in Perl, a common scripting language used in the field of bioinformatics:
  • Fundamentals of Perl: read/write, loops…
  • Fundamental Perl data structures: “bioinformatics” data files; dynamic arrays and hash tables.
  • Perl pattern-matching techniques (regular expressions) used in bioinformatics: searching for a pattern (e.g. ATG); extracting a pattern from a sequence; substituting one pattern for another (e.g. replacing T with U)
  • Creating Perl subroutines and Perl modules and using them in other Perl programs
  • Development of “basic” analytical tools for bioinformatics sequence data using Perl and core computational algorithms [these algorithms will be covered in the computational element of the module].
• Part 3a: Introduction to online bioinformatics resources:
  • How and where to obtain “bioinformatics” DNA data sequences and data relevant to these sequences
  • Explanation of the different elements of these data sets: “data annotation” (or metadata).
  • Fundamentals of common online DNA analytical tools (such as sequence alignment measurement)
• Part 3b: Computational bioinformatics for:
  • DNA pattern matching: global/local/multiple
  • Aligning DNA sequences, e.g. pairwise alignment
  • Application of alignment principles using basic computational methods
  • Reconstructing genomes (large DNA sequences) using “shotgun” alignment techniques
  • Principles of searching for “matching” DNA (gene) sequences in large online databases.
  • How to utilise and interpret the findings of DNA database searches, e.g. gene functionality.

Assignment and exam
• 1 assignment (40%):
  – Developing an application to analyse “small” DNA data sequences
  – A report discussing the findings of the online applications when applied to known DNA sequences
• Exam (60%): question 1 plus 2 of the 4 other questions
  – Question 1 is compulsory: bioinformatics Perl programming.
  – The other questions relate to the other areas of the module.
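
Example: the three pattern-matching operations named in Part 2
The syllabus names three regular-expression tasks: searching for a pattern (e.g. ATG), extracting a pattern from a sequence, and substituting one pattern for another (e.g. replacing T with U). The module covers these in Perl; the sketch below uses Python purely for illustration, and the regular expressions themselves carry over to Perl largely unchanged. The function names, the sample sequence and the choice of "ATG up to the first in-frame stop codon" as the pattern to extract are all assumptions made for this example, not part of the module material.

```python
import re

# Hypothetical sample sequence, chosen only for illustration.
SAMPLE_DNA = "CCATGGCATTTTAGGATGAA"


def find_start_codons(dna):
    """Search: return the 0-based position of every ATG in the sequence."""
    return [m.start() for m in re.finditer(r"ATG", dna)]


def extract_first_orf(dna):
    """Extract: return the first stretch that starts at ATG and ends at the
    first in-frame stop codon (TAA, TAG or TGA), or None if there is none.

    The lazy quantifier (*?) consumes as few codons as possible, so the
    match stops at the earliest in-frame stop codon.
    """
    match = re.search(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)", dna)
    return match.group(0) if match else None


def transcribe(dna):
    """Substitute: replace every T with U (DNA letters to RNA letters)."""
    return re.sub(r"T", "U", dna)


if __name__ == "__main__":
    print(find_start_codons(SAMPLE_DNA))
    print(extract_first_orf(SAMPLE_DNA))
    print(transcribe(SAMPLE_DNA))
```

In Perl the same three tasks map onto the match operator, capture groups and the substitution operator respectively, which is why regular expressions are introduced before the analytical tools in Part 2.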

Proposed schedule
• Week 2 to Week 6 (Thursday 18:00 to 20:00): “Part 1: Fundamentals of genetic systems”
• Week 2 to Week 6 (Thursday 20:15 to 21:15): Perl programming for bioinformatics
• Week 7: review week [submit assignment part 1]
• Week 8 to Week 12 (Thursday 18:00 to 20:00): “Computational techniques and their application to bioinformatics”
• Week 8 to Week 12 (Monday 20:15 to 21:15):
  – Online bioinformatics databases and analytical applications (approx. 2 weeks).
  – Development of fundamental computational applications using Perl (approx. 3 weeks)
• Week 10: submission of assignment part 2
• Week 13: review of course and sample exam paper

Assignment content
• Assignment 1:
  – A report on the analysis of the biological impact of developing a bioinformatics application.
  – Development of the fundamental functionality of the application, based on the findings of the report
• Assignment 2:
  – Using the application from assignment 1: analysis and development of the analytical component (“computational analysis”) of the application.
  – A report on the findings of applying the final application to a given dataset obtained from online bioinformatics databases.