Chemoinformatics, cheminformatics, chemical informatics: What is it?
Gary Wiggins and Wendie Shreve
Chemistry Library
Indiana University

Abstract
The terms “chemoinformatics,” “chemiinformatics,” “cheminformatics,” and “chemical informatics” are all used to describe a broad array of computer techniques and applications used to solve chemistry problems. We will look at the areas that comprise chemical informatics by examining the topics in existing textbooks and other secondary sources. The identified topics will be mapped to the graduate courses in the Chemical Informatics program at Indiana University.

Dmitrii Ivanovich Mendeleev, 1834-1907
Discoverer of the Periodic Table: An Early “Chemoinformatician”

Why Mendeleev?
Faced with a large amount of data, with many gaps, Mendeleev:
  Sought patterns where none were obvious
  Made predictions about properties of unknown chemical substances, based on observed properties of known substances
  Created a great visualization tool!

The Periodic Table of the Elements by Mark Winter

Chemical Informatics: The New “Handmaid” of Chemistry
M.G. Mellon noted that analytical chemistry was at one time considered the handmaid of chemistry (in his Chemical Publications, 5th ed.).
Handmaid (def.): Something that is necessarily subservient or subordinate to another: Ceremony is but the handmaid of worship. (Also, Handmaiden.)
--Random House Unabridged Dictionary, 2nd ed., 1993.
Handmaid, maybe, but definitely not handmade!

What is Chemical Informatics?
Chemical informatics helps chemists investigate new problems and organize and analyze scientific data to develop novel compounds, materials, and processes through the application of information technology.

Cheminformatics, etc.
in the Lit, March 2000
Prevalence of -informatics terms in the literature (number of retrievals containing the term)

Term                    Science Citation Index    SciFinder Scholar
                        (Web of Science)
Bioinformatics          364                       720
Chemical Informatics    20                        6
Chemoinformatics        included                  7
Chemiinformatics        included                  0
Cheminformatics         included                  9

Cheminformatics, etc. in the Lit, 31 July 2003
Prevalence of -informatics terms in the literature (number of retrievals containing the term)

Term                    Science Citation Index    SciFinder Scholar
                        (Web of Science)
Bioinformatics          1830                      5685
Chemical Informatics    13                        12
Chemoinformatics        32                        42
Chemiinformatics        1                         2
Cheminformatics         30                        56

Indiana University MS in Chemical Informatics
Major aspects of chemical informatics:
  Information Acquisition: Methods for generating and collecting data empirically (experimentation) or from theory (molecular simulation)
  Information Management: Storage and retrieval of information
  Information Use: Data analysis, correlation, and application to problems in the chemical and biochemical sciences

UMIST MSc in Cheminformatics
“This is a modular, one-year course which provides high-level training in the handling of chemical and biochemical information, molecular modelling and other aspects of cheminformatics.”

University of Sheffield MSc in Chemoinformatics Program
“Chemoinformatics involves the application of IT to chemical data and includes topics such as chemical databases, combinatorial library design, structure-activity relationships and structure-based drug design.”

Sheffield’s Short Course
Offered for the past three summers; in 4 days, it emphasizes applications in modern drug discovery.
Covers:
  2D databases and database searching
  Diversity and compound selection
  Moving into 3D: experimental data sources
  Computational methods for 3D
  3D databases
  Combinatorial libraries
  Analysis of high-throughput screening data

Graduate Courses in Chemical Informatics at Indiana University
C571 Chemical Information Technology
http://www.indiana.edu/~cheminfo/C571/571home.html
C572 Molecular Modeling & Computational Chemistry
http://www.indiana.edu/~cheminfo/C572/572home.html

JCICS – Major Research Areas
George W.A. Milne, C571 Lecture, Fall 2002
Chemical Information
  Text Searching
  Structure and Substructure Searching
  Databases
  Patents
Chemical Computation
  Quantum Mechanics
  Statistics (regression, neural nets, etc.)
  QSAR, QSPR
  Graph Theory
  DNA Computing
Molecular Modeling
  3D Structure Generation
  3D Searching (pharmacophores)
  Docking
Biopharmaceutical Computation
  Drug Design
  Combinatorial Chemistry
  Protein and Enzyme Structure
  Membrane Structure
  ADME-related Research

Desirable Skills for Chemistry Grads
George W.A. Milne, C571 Lecture, Fall 2002

Frank Brown’s Definition
“…the mixing of information resources to transform data into information and information into knowledge, for the intended purpose of making decisions faster in the arena of drug lead identification and optimisation.”
Brown, F.K. “Chemoinformatics: what is it and how does it impact drug discovery.” Annual Reports in Medicinal Chemistry, 1998, 33, 375-384.

Application of Cheminformatics in the Drug Industry
The computer is used to analyze the interactions between the drug and the receptor site and to design molecules with an optimal fit.
Once targets are developed, libraries of compounds are screened for activity with one or more relevant assays using High Throughput Screening.
Hits are then evaluated for binding, potency, selectivity, and functional activity.
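The hit-evaluation step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each compound is assumed to have an IC50 against the intended target and against one off-target, and the names AssayResult, selectivity, and triage_hits, the thresholds, and all compound data are hypothetical. None of them come from the lecture or from any screening software.

```python
from dataclasses import dataclass


@dataclass
class AssayResult:
    """One compound's (made-up) screening readouts."""
    compound_id: str
    target_ic50_nm: float      # potency against the intended target; lower is more potent
    off_target_ic50_nm: float  # potency against an unwanted target


def selectivity(result: AssayResult) -> float:
    """Ratio of off-target to on-target IC50; higher means more selective."""
    return result.off_target_ic50_nm / result.target_ic50_nm


def triage_hits(results, max_ic50_nm=1000.0, min_selectivity=10.0):
    """Keep compounds that are potent and selective enough, most potent first.

    Both thresholds are arbitrary illustrative values, not recommendations.
    """
    hits = [
        r for r in results
        if r.target_ic50_nm <= max_ic50_nm and selectivity(r) >= min_selectivity
    ]
    return sorted(hits, key=lambda r: r.target_ic50_nm)


# Hypothetical screening data, for illustration only.
screen = [
    AssayResult("CPD-001", 50.0, 5000.0),     # potent, selectivity 100
    AssayResult("CPD-002", 20.0, 100.0),      # potent, but selectivity only 5
    AssayResult("CPD-003", 5000.0, 90000.0),  # selective, but too weak
    AssayResult("CPD-004", 10.0, 2000.0),     # potent, selectivity 200
]

for hit in triage_hits(screen):
    print(hit.compound_id, hit.target_ic50_nm, selectivity(hit))
```

With these made-up values, only CPD-004 and CPD-001 pass both filters, in that order. A real workflow would also weigh binding and functional-activity data, which this sketch leaves out.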
Seeking to improve:
  Potency
  Absorption
  Distribution
  Metabolism
  Elimination

Some Methods and Tools
  Structure/Activity Relationships
  Genetic Algorithms
  Statistical Tools (e.g., recursive partitioning)
  Data Analysis Tools
  Visualization
  Hardware Developments
  Chemically-Aware Web Language (CML)

CAS Indexing of a Relevant Article
“The impact of informatics and computational chemistry on synthesis and screening.” Manly, Charles J.; Louise-May, Shirley; Hammer, Jack D. Drug Discovery Today (2001), 6(21), 1101-1110. A review with 87 references.

Controlled Vocabulary Indexing of the Manly Article
  Chemistry
  High throughput screening
  Drug screening
  Bioinformatics
  Combinatorial chemistry
  Drug design
  Molecular modeling
  Pharmacokinetics
  Combinatorial library

Informatics Components (per Dow Chemical Visitors)
Components of an Informatics Architecture:
  LIMS
  Substance Registry
  Electronic Records Mgmt
  Process Data Mgmt
  System Integration & User Interface

Chemical R&D vs. Pharmaceutical R&D
  Much smaller number of substances tested in a week
  Much larger number of tests to consider
  Answers tend to come in shades of gray rather than yes or no
  Targets change frequently in chemical R&D
  Must integrate a large variety of sources that were not designed for integration
  New approach to taxonomy is needed
--L. David Rothman, The Dow Chemical Co.

Characteristics of a Chemical Informatics Faculty Member
  Appreciates the value of algorithms
  Is interested in data mining, data modeling, and relational database systems
  Pays attention to searching issues and the literature
  Has compatibility and commonality with bioinformatics research
  Is able to talk to computer scientists

Major Journals
  Journal of Chemical Information and Computer Sciences (ACS)
  Journal of Molecular Graphics and Modelling (Elsevier)
  Journal of Combinatorial Chemistry (ACS)
  Journal of Proteome Research (ACS)
  Proteomics (Wiley-VCH)
  Molecular and Cellular Proteomics (ASBMB)
  Acta Crystallographica (IUCr)

Textbooks
Leach, Andrew R.; Gillet, Valerie J. An Introduction to Chemoinformatics. Kluwer, 2003. ISBN 1-4020-1347-7
Engel, Thomas. Chemoinformatics: A Textbook. Wiley-VCH, expected date of publication: August 2003. ISBN 3-527-30681-1

Reference Works
Encyclopedia of Computational Chemistry. Schleyer, P. von R.; Allinger, N.L.; Clark, T.; Gasteiger, J.; Kollman, P.A.; Schaefer, H.F.; Schreiner, P.R. (Eds.). 5 v. Wiley, Chichester, 1998.
Gasteiger, Johann J., ed. Handbook of Chemoinformatics: From Data to Knowledge. 4 v. Wiley-VCH, expected date of publication: August 2003. ISBN 3-527-30680-3
Reviews in Computational Chemistry. Wiley-VCH, 1990- .
Paris, Greg. Bibliography: Chemical Information Retrieval and 3D Searching. http://panizzi.shef.ac.uk/cisrg/links/grep/chemDB.4.html
SIRCh: Chemical Informatics Home Page at Indiana University. http://www.indiana.edu/~cheminfo/informatics/cinformhome.html

Conclusion
Chemical Informatics is an evolving field with many facets.
It will become increasingly important in areas of chemistry outside the drug industry.
It will play an increasing role in the developing area of proteomics.

Bibliography
Brown, F.K. “Chemoinformatics: what is it and how does it impact drug discovery.” Annual Reports in Medicinal Chemistry, 1998, 33, 375-384.
Glen, Robert. “Developing tools and standards in molecular informatics.” Chemical Communications, 2002, (23), 2745-2747.
Hann, Mike; Green, Richard. “Chemoinformatics – a new name for an old problem?” Current Opinion in Chemical Biology, 1999, 3, 379-383.
Lipinski, C.A.; Lombardo, F.; Dominy, B.W.; Feeney, P.J.
“Experimental and computational approaches to estimate the solubility and permeability in drug discovery and development settings.” Advanced Drug Delivery Reviews, 1997, 23, 3-15.
Russo, Eugene. “Chemistry plans a structural overhaul.” Nature (Naturejobs), 12 September 2002, 419(6903). http://www.nature.com/naturejobs/careersandrecruitment/2002.html
Rothman, L. David. “Information management for research in the chemical industry.” Abstracts of Papers, 223rd ACS National Meeting, Orlando, FL, United States, April 7-11, 2002 (2002), CINF-044.
Smith, Chris. “Cheminformatics: Redefining the crucible.” The Scientist, 2002, 16(8), 40.