Supplement 1 for Table 4. 79 highly cited biomedical text mining papers

| No | Title | Author | Country | Source of Literature | Year of publication | Times Cited | Annual Average Times Cited |
|---|---|---|---|---|---|---|---|
| 1 | Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources | Huang, Da Wei; Sherman, Brad T.; Lempicki, Richard A. | USA | NATURE PROTOCOLS | 2009 | 3177 | 529.50 |
| 2 | The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology | Camon, E; Magrane, M; Barrell, D, et al. | UK | NUCLEIC ACIDS RESEARCH | 2004 | 366 | 33.27 |
| 3 | RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation | Gama-Castro, Socorro; Jimenez-Jacinto, Veronica; Peralta-Gil, Martin, et al. | Morelos | NUCLEIC ACIDS RESEARCH | 2008 | 274 | 39.14 |
| 4 | Cluster analysis for gene expression data: A survey | Jiang, DX; Tang, C; Zhang, AD | USA | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2004 | 269 | 24.45 |
| 5 | Literature mining for the biologist: from information retrieval to biological discovery | Jensen, LJ; Saric, J; Bork, P | Germany | NATURE REVIEWS GENETICS | 2006 | 237 | 26.33 |
| 6 | Computational cluster validation in post-genomic data analysis | Handl, J; Knowles, J; Kell, DB | UK | BIOINFORMATICS | 2005 | 236 | 23.60 |
| 7 | Clustering by compression | Cilibrasi, R; Vitanyi, PMB | Netherlands | IEEE TRANSACTIONS ON INFORMATION THEORY | 2005 | 228 | 22.80 |
| 8 | The Mouse Genome Database (MGD): mouse biology and model systems | Bult, Carol J.; Eppig, Janan T.; Kadin, James A., et al. | USA | NUCLEIC ACIDS RESEARCH | 2008 | 222 | 31.71 |
| 9 | Textpresso: An ontology-based information retrieval and extraction system for biological literature | Muller, HM; Kenny, EE; Sternberg, PW | USA | PLOS BIOLOGY | 2004 | 201 | 18.27 |
| 10 | A survey of current work in biomedical text mining | Cohen, AM; Hersh, WR | USA | BRIEFINGS IN BIOINFORMATICS | 2005 | 177 | 17.70 |
| 11 | Textpresso: An ontology-based information retrieval and extraction system for biological literature | Muller, HM; Kenny, EE; Sternberg, PW | USA | PLOS BIOLOGY | 2004 | 171 | 19 |
| 12 | BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments | Al-Shahrour, Fatima; Minguez, Pablo; Tarraga, Joaquin; Montaner, David; Alloza, Eva; Vaquerizas, Juan M.; Conde, Lucia; Blaschke, Christian; Vera, Javier; Dopazo, Joaquin | Spain | NUCLEIC ACIDS RESEARCH | 2006 | 168 | 24 |
| 13 | Overview of BioCreAtIvE: critical assessment of information extraction for biology | Hirschman, L; Yeh, A; Blaschke, C; Valencia, A | USA,Spain | BMC BIOINFORMATICS | 2005 | 151 | 16.8 |
| 14 | A text-mining analysis of the human phenome | van Driel, MA; Bruggeman, J; Vriend, G; Brunner, HG; Leunissen, JA | Netherlands | EUROPEAN JOURNAL OF HUMAN GENETICS | 2006 | 150 | 18.8 |
| 15 | A systematic review of modafinil: Potential clinical uses and mechanisms of action | Ballon, JS; Feifel, D | USA | JOURNAL OF CLINICAL PSYCHIATRY | 2006 | 144 | 18 |
| 16 | BRENDA, the enzyme information system in 2011 | Scheer, Maurice; Grote, Andreas; Chang, Antje; Schomburg, Ida; Munaretto, Cornelia; Rother, Michael; Soehngen, Carola; Stelzer, Michael; Thiele, Juliane; Schomburg, Dietmar | Germany | NUCLEIC ACIDS RESEARCH | 2011 | 135 | 45 |
| 17 | FatiGO+: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments | Al-Shahrour, Fatima; Minguez, Pablo; Tarraga, Joaquin; Medina, Ignacio; Alloza, Eva; Montaner, David; Dopazo, Joaquin | Spain | NUCLEIC ACIDS RESEARCH | 2007 | 132 | 18.9 |
| 18 | GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data | Rzhetsky, A; Iossifov, I; Koike, T; Krauthammer, M; Kra, P; Morris, M; Yu, H; Duboue, PA; Weng, WB; Wilbur, WJ; Hatzivassiloglou, V; Friedman, C | USA,Japan | JOURNAL OF BIOMEDICAL INFORMATICS | 2004 | 123 | 12.3 |
| 19 | ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text | Settles, B | USA | BIOINFORMATICS | 2005 | 117 | 14.6 |
| 20 | Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome | Ramani, AK; Bunescu, RC; Mooney, RJ; Marcotte, EM | USA | GENOME BIOLOGY | 2005 | 116 | 12.9 |
| 21 | Discovering disease-genes by topological features in human protein-protein interaction network | Xu, Jianzhen; Li, Yongjin | Peoples R China | BIOINFORMATICS | 2006 | 115 | 16.4 |
| 22 | Multilabel neural networks with applications to functional genomics and text categorization | Zhang, Min-Ling; Zhou, Zhi-Hua | Peoples R China | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2006 | 113 | 16.1 |
| 23 | Integration of text- and data-mining using ontologies successfully selects disease gene candidates | Tiffin, N; Kelso, JF; Powell, AR; Pan, H; Bajic, VB; Hide, WA | South Africa,Singapore | NUCLEIC ACIDS RESEARCH | 2005 | 109 | 15.6 |
| 24 | Molecular characterization of pediatric gastrointestinal stromal tumors | Agaram, Narasimhan P.; Laquaglia, Michael P.; Ustun, Berrin; Guo, Tianhua; Wong, Grace C.; Socci, Nicholas D.; Maki, Robert G.; DeMatteo, Ronald P.; Besmer, Peter; Antonescu, Cristina R. | USA | CLINICAL CANCER RESEARCH | 2008 | 108 | 18 |
| 25 | The Human Serum Metabolome | Psychogios, Nikolaos; Hau, David D.; Peng, Jun; Guo, An Chi; Mandal, Rupasri; Bouatra, Souhaila; Sinelnikov, Igor; Krishnamurthy, Ramanarayan; Eisner, Roman; Gautam, Bijaya; Young, Nelson; Xia, Jianguo; Knox, Craig; Dong, Edison; Huang, Paul; Hollander, Zsuzsanna; Pedersen, Theresa L.; Smith, Steven R.; Bamforth, Fiona; Greiner, Russ; McManus, Bruce; Newman, John W.; Goodfriend, Theodore; Wishart, David S. | Canada,USA | PLOS ONE | 2011 | 105 | 35 |
| 26 | Discovery of drug mode of action and drug repositioning from transcriptional responses | Iorio, Francesco; Bosotti, Roberta; Scacheri, Emanuela; Belcastro, Vincenzo; Mithbaokar, Pratibha; Ferriero, Rosa; Murino, Loredana; Tagliaferri, Roberto; Brunetti-Pierri, Nicola; Isacchi, Antonella; di Bernardo, Diego | Italy | PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA | 2010 | 102 | 25.5 |
| 27 | Text mining and its potential applications in systems biology | Ananiadou, Sophia; Kell, Douglas B.; Tsujii, Jun-ichi | UK,Japan | TRENDS IN BIOTECHNOLOGY | 2006 | 102 | 14.6 |
| 28 | Machine learning in bioinformatics | Larranaga, Pedro; Calvo, Borja; Santana, Roberto; Bielza, Concha; Galdiano, Josu; Inza, Inaki; Lozano, Jose A.; Armananzas, Ruben; Santafe, Guzman; Perez, Aritz; Robles, Victor | Spain,USA | BRIEFINGS IN BIOINFORMATICS | 2006 | 97 | 13.9 |
| 29 | Drosophila DNase I footprint database: a systematic genome annotation of transcription factor binding sites in the fruitfly, Drosophila melanogaster | Bergman, CM; Carlson, JW; Celniker, SE | UK,USA | BIOINFORMATICS | 2005 | 93 | 10.3 |
| 30 | Text mining and ontologies in biomedicine: Making sense of raw text | Spasic, I; Ananiadou, S; McNaught, J; Kumar, A | UK | BRIEFINGS IN BIOINFORMATICS | 2005 | 88 | 9.78 |
| 31 | Experiencing SAX: a novel symbolic representation of time series | Lin, Jessica; Keogh, Eamonn; Wei, Li; Lonardi, Stefano | USA | DATA MINING AND KNOWLEDGE DISCOVERY | 2007 | 87 | 14.5 |
| 32 | Alternative splicing in cancer: Noise, functional, or systematic? | Skotheim, Rolf I.; Nees, Matthias | Finland,Norway | INTERNATIONAL JOURNAL OF BIOCHEMISTRY & CELL BIOLOGY | 2007 | 86 | 14.3 |
| 33 | Hedgehog signaling pathway and gastric cancer | Katoh, Y; Katoh, M | Japan | CANCER BIOLOGY & THERAPY | 2005 | 85 | 10.6 |
| 34 | The human cerebrospinal fluid metabolome | Wishart, David S.; Lewis, Michael J.; Morrissey, Joshua A.; Flegel, Mitchel D.; Jeroncic, Kevin; Xiong, Yeping; Cheng, Dean; Eisner, Roman; Gautam, Bijaya; Tzur, Dan; Sawhney, Summit; Bamforth, Fiona; Greiner, Russ; Li, Liang | Canada | JOURNAL OF CHROMATOGRAPHY B-ANALYTICAL TECHNOLOGIES IN THE BIOMEDICAL AND LIFE SCIENCES | 2008 | 81 | 16.2 |
| 35 | Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling | Medina, Ignacio; Carbonell, Jose; Pulido, Luis; Madeira, Sara C.; Goetz, Stefan; Conesa, Ana; Tarraga, Joaquin; Pascual-Montano, Alberto; Nogales-Cadenas, Ruben; Santoyo, Javier; Garcia, Francisco; Marba, Martina; Montaner, David; Dopazo, Joaquin | Spain,Portugal | NUCLEIC ACIDS RESEARCH | 2010 | 80 | 20 |
| 36 | Understanding ZHENG in traditional Chinese medicine in the context of neuroendocrine-immune network | Li, S.; Zhang, Z. Q.; Wu, L. J.; Zhang, X. G.; Li, Y. D.; Wang, Y. Y. | Peoples R China | IET SYSTEMS BIOLOGY | 2007 | 79 | 11.3 |
| 37 | A human functional protein interaction network and its application to cancer data analysis | Wu, Guanming; Feng, Xin; Stein, Lincoln | Canada,USA | GENOME BIOLOGY | 2010 | 73 | 18.3 |
| 38 | Frontiers of biomedical text mining: current progress | Zweigenbaum, Pierre; Demner-Fushman, Dina; Yu, Hong; Cohen, Kevin B. | France,USA | BRIEFINGS IN BIOINFORMATICS | 2007 | 73 | 12.2 |
| 39 | Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells | Kell, DB | UK | FEBS JOURNAL | 2006 | 73 | 9.13 |
| 40 | Corpus annotation for mining biomedical events from literature | Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi | Japan,UK | BMC BIOINFORMATICS | 2008 | 73 | 12.2 |
| 41 | HomoMINT: an inferred human network based on orthology mapping of protein interactions discovered in model organisms | Persico, M; Ceol, A; Gavrila, C; Hoffmann, R; Florio, A; Cesareni, G | Italy,USA | BMC BIOINFORMATICS | 2005 | 73 | 8.11 |
| 42 | Linking genes to literature: text mining, information extraction, and retrieval applications for biology | Krallinger, Martin; Valencia, Alfonso; Hirschman, Lynette | Spain | GENOME BIOLOGY | 2008 | 73 | 12.2 |
| 43 | Identification and characterization of human HES2, HES3, and HES5 genes in silico | Katoh, M; Katoh, M | Japan | INTERNATIONAL JOURNAL OF ONCOLOGY | 2004 | 73 | 12.2 |
| 44 | Text-mining and information-retrieval services for molecular biology | Krallinger, M; Valencia, A | Spain | GENOME BIOLOGY | 2005 | 72 | 8 |
| 45 | PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites | Cheng, Dean; Knox, Craig; Young, Nelson; Stothard, Paul; Damaraju, Sambasivarao; Wishart, David S. | Canada | NUCLEIC ACIDS RESEARCH | 2008 | 70 | 11.7 |
| 46 | Systematic association of genes to phenotypes by genome and literature mining | Korbel, JO; Doerks, T; Jensen, LJ; Perez-Iratxeta, C; Kaczanowski, S; Hooper, SD; Andrade, MA; Bork, P | Germany,Canada,Poland | PLOS BIOLOGY | 2005 | 70 | 7.78 |
| 47 | Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge | Krallinger, Martin; Morgan, Alexander; Smith, Larry; Leitner, Florian; Tanabe, Lorraine; Wilbur, John; Hirschman, Lynette; Valencia, Alfonso | USA,Spain | GENOME BIOLOGY | 2008 | 68 | 11.3 |
| 48 | The mouse genome database (MGD): new features facilitating a model system | Eppig, Janan T.; Blake, Judith A.; Bult, Carol J.; Kadin, James A.; Richardson, Joel E. | USA | NUCLEIC ACIDS RESEARCH | 2007 | 68 | 9.71 |
| 49 | Recognizing names in biomedical texts: a machine learning approach | Zhou, GD; Zhang, J; Su, J; Shen, D; Tan, CL | Singapore | BIOINFORMATICS | 2004 | 67 | 6.7 |
| 50 | HAGR: the human ageing genomic resources | de Magalhaes, JP; Costa, J; Toussaint, O | USA | NUCLEIC ACIDS RESEARCH | 2005 | 67 | 7.44 |
| 51 | Text processing through web services: calling Whatizit | Rebholz-Schuhmann, Dietrich; Arregui, Miguel; Gaudan, Sylvain; Kirsch, Harald; Jimeno, Antonio | UK | BIOINFORMATICS | 2008 | 66 | 11 |
| 52 | An evaluation of GO annotation retrieval for BioCreAtIvE and GOA | Camon, EB; Barrell, DG; Dimmer, EC; Lee, V; Magrane, M; Maslen, J; Binns, D; Apweiler, R | UK | BMC BIOINFORMATICS | 2005 | 63 | 7 |
| 53 | Knowledge discovery in traditional Chinese medicine: State of the art and perspectives | Feng, Yi; Wu, Zhaohui; Zhou, Xuezhong; Zhou, Zhongmei; Fan, Weiyu | Peoples R China | ARTIFICIAL INTELLIGENCE IN MEDICINE | 2006 | 62 | 8.86 |
| 54 | The edaphic factor in the origin of plant species | Rajakaruna, N | USA | INTERNATIONAL GEOLOGY REVIEW | 2004 | 61 | 6.78 |
| 55 | Target discovery from data mining approaches | Yang, Yongliang; Adelstein, S. James; Kassis, Amin I. | USA | DRUG DISCOVERY TODAY | 2009 | 60 | 12 |
| 56 | Expression and genomic profiling of colorectal cancer | Cardoso, J.; Boer, J.; Morreau, H.; Fodde, R. | Netherlands | BIOCHIMICA ET BIOPHYSICA ACTA-REVIEWS ON CANCER | 2007 | 59 | 8.43 |
| 57 | WNT antagonist, SFRP1, is Hedgehog signaling target | Katoh, Y; Katoh, M | Japan | INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE | 2006 | 58 | 7.25 |
| 58 | The Mouse Genome Database genotypes::phenotypes | Blake, Judith A.; Bult, Carol J.; Eppig, Janan T.; Kadin, James A.; Richardson, Joel E. | USA | NUCLEIC ACIDS RESEARCH | 2009 | 57 | 11.4 |
| 59 | BioCreAtIvE task IA: gene mention finding evaluation | Yeh, A; Morgan, A; Colosimo, M; Hirschman, L | USA | BMC BIOINFORMATICS | 2005 | 56 | 6.22 |
| 60 | The application of systems biology to drug discovery | Cho, Carolyn R.; Labow, Mark; Reinhardt, Mischa; van Oostrum, Jan; Peitsch, Manuel C. | USA | CURRENT OPINION IN CHEMICAL BIOLOGY | 2006 | 55 | 6.88 |
| 61 | Evidence-Based Annotation of the Malaria Parasite's Genome Using Comparative Expression Profiling | Zhou, Yingyao; Ramachandran, Vandana; Kumar, Kota Arun; Westenberger, Scott; Refour, Phillippe; Zhou, Bin; Li, Fengwu; Young, Jason A.; Chen, Kaisheng; Plouffe, David; Henson, Kerstin; Nussenzweig, Victor; Carlton, Jane; Vinetz, Joseph M.; Duraisingh, Manoj T.; Winzeler, Elizabeth A. | USA | PLOS ONE | 2008 | 54 | 9 |
| 62 | Integrating high-throughput technologies in the quest for effective biomarkers for ovarian cancer | Kulasingam, Vathany; Pavlou, Maria P.; Diamandis, Eleftherios P. | Canada | NATURE REVIEWS CANCER | 2010 | 53 | 13.3 |
| 63 | Evaluation of BioCreAtIvE assessment of task 2 | Blaschke, C; Leon, EA; Krallinger, M; Valencia, A | Spain | BMC BIOINFORMATICS | 2005 | 52 | 5.78 |
| 64 | Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments | Al-Shahrour, Fatima; Carbonell, Jose; Minguez, Pablo; Goetz, Stefan; Conesa, Ana; Tarraga, Joaquin; Medina, Ignacio; Alloza, Eva; Montaner, David; Dopazo, Joaquin | Spain | NUCLEIC ACIDS RESEARCH | 2008 | 50 | 8.33 |
| 65 | Networks Inferred from Biochemical Data Reveal Profound Differences in Toll-like Receptor and Inflammatory Signaling between Normal and Transformed Hepatocytes | Alexopoulos, Leonidas G.; Saez-Rodriguez, Julio; Cosgrove, Benjamin D.; Lauffenburger, Douglas A.; Sorger, Peter K. | USA | MOLECULAR & CELLULAR PROTEOMICS | 2010 | 50 | 12.5 |
| 66 | Text mining approaches in molecular biology and biomedicine | Krallinger, M; Erhardt, RAA; Valencia, A | Spain | DRUG DISCOVERY TODAY | 2005 | 50 | 5.56 |
| 67 | Techniques: Bioprospecting historical herbal texts by hunting for new leads in old tomes | Buenz, EJ; Schnepple, DJ; Bauer, BA; Elkin, PL; Riddle, JM; Motley, TJ | USA | TRENDS IN PHARMACOLOGICAL SCIENCES | 2004 | 47 | 4.7 |
| 68 | Cancer systems biology: exploring cancer-associated genes on cellular networks | Wang, E.; Lenferink, A.; Connor-McCourt, M. O. | Canada | CELLULAR AND MOLECULAR LIFE SCIENCES | 2007 | 47 | 7.83 |
| 69 | The next generation of literature analysis: Integration of genomic analysis into text mining | Scherf, M; Epple, A; Werner, T | Germany | BRIEFINGS IN BIOINFORMATICS | 2005 | 46 | 5.75 |
| 70 | PROMISCUOUS: a database for network-based drug-repositioning | von Eichborn, Joachim; Murgueitio, Manuela S.; Dunkel, Mathias; Koerner, Soeren; Bourne, Philip E.; Preissner, Robert | Germany,USA | NUCLEIC ACIDS RESEARCH | 2011 | 46 | 15.3 |
| 71 | Gene regulatory networks in lactation: identification of global principles using bioinformatics | Lemay, Danielle G.; Neville, Margaret C.; Rudolph, Michael C.; Pollard, Katherine S.; German, J. Bruce | USA | BMC SYSTEMS BIOLOGY | 2007 | 46 | 7.67 |
| 72 | Event extraction for systems biology by text mining the literature | Ananiadou, Sophia; Pyysalo, Sampo; Tsujii, Jun'ichi; Kell, Douglas B. | Japan | TRENDS IN BIOTECHNOLOGY | 2010 | 45 | 11.3 |
| 73 | Computational methods for Traditional Chinese Medicine: A survey | Lukman, Suryani; He, Yulan; Hui, Siu-Cheung | UK | COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE | 2007 | 45 | 7.5 |
| 74 | 2-Alkyl-4-hydroxymethylfuran-3-carboxylic acids, antibiotic production inducers discovered by Streptomyces coelicolor genome mining | Corre, Christophe; Song, Lijiang; O'Rourke, Sean; Chater, Keith F.; Challis, Gregory L. | UK | PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA | 2008 | 44 | 8.8 |
| 75 | Dynamic interaction networks in a hierarchically organized tissue | Kirouac, Daniel C.; Ito, Caryn; Csaszar, Elizabeth; Roch, Aline; Yu, Mei; Sykes, Edward A.; Bader, Gary D.; Zandstra, Peter W. | Canada | MOLECULAR SYSTEMS BIOLOGY | 2010 | 43 | 10.8 |
| 76 | LINNAEUS: A species name identification system for biomedical literature | Gerner, Martin; Nenadic, Goran; Bergman, Casey M. | UK | BMC BIOINFORMATICS | 2010 | 43 | 10.8 |
| 77 | A Pathway-Based View of Human Diseases and Disease Relationships | Li, Yong; Agarwal, Pankaj | USA | PLOS ONE | 2009 | 42 | 10.5 |
| 78 | Comparing Signaling Networks between Normal and Transformed Hepatocytes Using Discrete Logical Models | Saez-Rodriguez, Julio; Alexopoulos, Leonidas G.; Zhang, MingSheng; Morris, Melody K.; Lauffenburger, Douglas A.; Sorger, Peter K. | USA | CANCER RESEARCH | 2011 | 42 | 21 |
| 79 | Urine proteomics for profiling of human disease using high accuracy mass spectrometry | Kentsis, Alex; Monigatti, Flavio; Dorff, Kevin; Campagne, Fabien; Bachur, Richard; Steen, Hanno | USA | PROTEOMICS CLINICAL APPLICATIONS | 2009 | 42 | 10.5 |