Attila Kertész-Farkas’ CV

Curriculum Vitae

Kertész-Farkas, Attila, Ph.D.
Post-Doctoral Fellow
Protein Structure and Bioinformatics Group,
International Centre for Genetic Engineering and Biotechnology, Trieste, Italy

Personal Data
Born: November 26, 1979 in Kiskunmajsa (Hungary)
Hungarian nationality, male, single
Address: ICGEB, Padriciano 99, Trieste, Italy, 34149
E-mail: kfattila@icgeb.org
Phone: +39 346 0396947

Former Affiliations (postdocs)
2008-2009  Division of Imaging and Applied Mathematics, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Maryland, USA
2008-2009  Department of Biology, University of Maryland Baltimore County (UMBC)

Education
2004-2008  PhD studies at the University of Szeged, Szeged (Hungary)
1999-2004  University of Szeged, MSc in Computer Science and Mathematics (specialising in Fundamentals of Computer Science), summa cum laude. Title of thesis work: Compact representation of finite languages with nondeterministic automata (in Hungarian)
1994-1998  High school, Kiskunhalas (Hungary), special informatics class

Scholarship
2006-2007  Erasmus, one winter semester, Technische Universitaet Dresden, Germany

Awards
April 2004     1st prize at a Scientific Conference of Students for the article entitled "Kernel-based learning with feature weighting" (in Hungarian, TDK)
November 2002  1st prize at a Scientific Conference of Students for the article entitled "Compact representation of finite languages with nondeterministic automata" (in Hungarian, TDK)

Experience and research topics

Research activity:

2009-  I am working on two projects. The first relates to tandem mass spectrometry: I am developing a de novo algorithm for the identification of post-translational modifications in peptide mass spectra. The second project analyses the integration of biological and medical databases.

2008-2009  During my postdoctoral fellowship in the USA I was involved in mutation extraction from the scientific literature using natural language processing techniques, and I also worked on the classification of disease mutations using machine learning techniques.

2004-2008  During my PhD studies at the Research Group on Artificial Intelligence my primary research areas were statistical machine learning and bioinformatics, including structural and functional protein classification; protein ranking; design of similarity functions for protein sequences, learning of similarity and spectral decomposition of similarity; kernel functions for strings and discrete structures; operations research; DNA microarray processing; and constructing protein databases and benchmark collections for classification. The developed methods were evaluated mostly with Matlab, but several other technologies were also used, such as Java, C and MySQL under Linux and Windows. I collaborated with several research groups from Europe, and the results were published in well-regarded journals and conference proceedings. For details see my publication list and affiliations.

2000-2004  During my undergraduate studies at the Research Group on Artificial Intelligence I developed and implemented, in various environments, the following linguistic software projects to aid the speech recognition system for the Hungarian language:
  A neural network editor for designing artificial neural network structures
  A context-free grammar editor to design a language model for speech recognition
  A phonetic transcriber for the Hungarian language
  A nondeterministic automata reduction algorithm to reduce the size of the language model
  A morphological parser for the Hungarian language

Talks:
Protein databases and similarity measures for protein classification, Bioinformatics seminar at Renyi Mathematical Institute, Budapest on 3.10.2007.
(Hungary)
Equivalence Learning in Protein Classification, Young Researcher Symposium on Intelligent Systems at the John von Neumann Computer Society in Budapest on 23.11.2007. (Hungary)

Participation in projects:
2001-2002  Code SZT-IS-10: Open Source Segmentation and Annotation Software for Hungarian Speech Databases
2000-2002  Code IKTA-3/049/2000: Establishing a Hungarian Telephone Speech Database

Teaching experience:
  Operational Research (tutorial)
  Introduction to Pascal and C programming (tutorial)
  Introduction to Java programming (tutorial)
  Algorithms and data structures (tutorial)

Technical Program Committee:
  International Conference on BioComputation, Bioinformatics, and Biomedical Technologies, BIOTECHNO 2008

Memberships:
  The John von Neumann Computer Society
  Eötvös Collegium

Computer skills:
  Matlab (using toolboxes such as Bioinformatics, Mosek, LSSVMLab, NeuralNet, Spider, Statistics, Optimization toolbox)
  Weka, C/C++, Visual C++ with MFC
  Java, J2ME, JBuilder and JBuilder MobileSet, Nokia MobileToolset (MIDP, CLDC)
  Pascal, Prolog, OpenGL, Python, Internet programming, Perl
  Relational databases, SQL, ODBC, JDBC
  Windows, Linux, MS Office, LaTeX

Languages:
  Hungarian: native
  English: fluent
  German: basic level

January 24, 2011
Attila Kertész-Farkas

Publications

Number of independent citations: at least 40.

Book chapter

1. Attila Kertész-Farkas[1], András Kocsor[3] and Sándor Pongor[4,5], The application of Data Compression-based Distance to Biological Sequences, In: Frank Emmert-Streib (Ed.), Information Theory and Statistical Learning, 2008, pp. 73-88

Refereed journal papers

2. E. Doughty[11], A. Kertesz-Farkas[10,11], O. Bodenreider[12], G. Thompson[11], A. Adadey[11], T. Peterson[11] and M. G. Kann[11], Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature, Bioinformatics, 2010, in press. Shared first authorship.

3.
Dhir S.[4], Pacurar M.[4], Franklin D.[4], Gáspári Z.[7], Kertész-Farkas A.[4], Kocsor A.[3], Eisenhaber F. and Pongor S.[4,5], Detecting atypical examples of known domain types by sequence similarity searching: The SBASE domain library approach, Current Protein & Peptide Science, 2010, in press

4. József Dombi[8] and Attila Kertész-Farkas[1], Applying Fuzzy Technologies to Equivalence Learning in Protein Classification, Journal of Computational Biology, 16(4), 2009, pp. 611-623

5. Róbert Busa-Fekete[1], Attila Kertész-Farkas[1], András Kocsor[3] and Sándor Pongor[4,5], Balanced ROC analysis (BAROC) protocol for the evaluation of protein similarities, Journal of Biochemical and Biophysical Methods, 70(6), 2008, pp. 1210-1214

6. Attila Kertész-Farkas[1,2], András Kocsor[1,3] and Sándor Pongor[4,5], Equivalence Learning in Protein Classification, In: P. Perner (Ed.), Machine Learning and Data Mining in Pattern Recognition, LNAI (4571), Springer Verlag, Heidelberg, 2007, pp. 824-837

7. Attila Kertész-Farkas[1], Somdutta Dhir[4], Paolo Sonego[4], Mircea Pacurar[4], Sergiu Netoteia[5], Harm Nijveen[6], Arnold Kuzniar[6], Jack A.M. Leunissen[6], András Kocsor[1] and Sándor Pongor[4,5], Benchmarking protein classification algorithms via supervised cross-validation, Journal of Biochemical and Biophysical Methods, 70(6), 2008, pp. 1215-1223

8. Paolo Sonego[4], Mircea Pacurar[4], Somdutta Dhir[4], Attila Kertész-Farkas[1], András Kocsor[1], Zoltán Gáspári[7], Jack A.M. Leunissen[6] and Sándor Pongor[4,5], A Protein Classification Benchmark collection for Machine Learning, Nucleic Acids Research (35), 2006, D232-6

9. János Z. Kelemen[8], Attila Kertész-Farkas[1], András Kocsor[1] and László G. Puskás[8], Kalman Filtering for Disease-State Estimation from Microarray Data, Bioinformatics (22), 2006, pp. 3047-3053

10.
László Kaján[4], Attila Kertész-Farkas[1], Dino Franklin[4], Nelly Ivanova[4], András Kocsor[1] and Sándor Pongor[4,5], Application of a simple log likelihood ratio approximant to protein sequence classification, Bioinformatics (22), 2006, pp. 2865-2869

11. András Kocsor[1], Attila Kertész-Farkas[1], László Kaján[4] and Sándor Pongor[4,5], Application of compression-based distance measures to protein sequence classification: a methodological study, Bioinformatics (22), 2006, pp. 407-412

Other journal papers

12. Attila Kertész-Farkas[1] and András Kocsor[1], Kernel-based Classification of Tissues using Feature Weightings, Applied Ecology and Environmental Research 4(2), 2006, pp. 63-71

13. Attila Kertész-Farkas[1], Zoltán Fülöp[9] and András Kocsor[1], Compact Representation of Hungarian Corpora (in Hungarian), Hungarian Journal of Applied Linguistics (1-2), 2005, pp. 63-70

Conferences

14. Attila Kertész-Farkas[1,2], András Kocsor[1,3] and Sándor Pongor[4,5], Equivalence Learning in Protein Classification, International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2007, 18-20 July 2007, Leipzig (Germany)

15. Attila Kertész-Farkas[1] and András Kocsor[1], Classification of Tissues using Feature Weightings (in Hungarian), VII. Hungarian Conference on Biometrics and Biomathematics, 5th-6th July 2005, Budapest (Hungary)

16. Attila Kertész-Farkas[1], Zoltán Fülöp[9] and András Kocsor[1], Compact Representation of Hungarian Vocabulary with Nondeterministic Finite Automata, I. Conference on Hungarian Computational Linguistics (in Hungarian, I. Magyar Számítógépes Nyelvészet Konferencia), 10th-11th December 2003, Szeged (Hungary), pp. 231-237

17. Attila Kertész-Farkas, Kernel-based learning with dimension weighting (in Hungarian), Scientific Conference of Students, spring 2004, Supervisor: András Kocsor[1]

18.
Attila Kertész-Farkas, Compact representation of finite languages with nondeterministic automata (in Hungarian), Scientific Conference of Students, autumn 2002, Supervisors: Zoltán Fülöp[9] and András Kocsor[1]

Poster

19. Attila Kertesz-Farkas[10,11], Olivier Bodenreider[12], Trevor C. Suznick[11], Gary Thompson[11], Yanan Sun[11] and Maricel G. Kann[11], EMU: A tool for the Extraction of MUtations with disease associations from literature, Growth Factor and Signal Transduction Conferences: System Biology, Integrative, Comparative and Multi-scale Modeling, 11–14 June, Iowa, USA

Co-authors’ Affiliations:
[1] Research Group on Artificial Intelligence, Szeged (Hungary)
[2] Erasmus Program, Technische Universitaet Dresden (Germany)
[3] Applied Intelligence Laboratory, Szeged (Hungary)
[4] Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Trieste (Italy)
[5] Bioinformatics Group, Biological Research Centre, Szeged (Hungary)
[6] Laboratory of Bioinformatics, Wageningen University and Research Centre, Wageningen (The Netherlands)
[7] Institute of Chemistry, Eötvös Loránd University, Budapest (Hungary)
[8] Laboratory of Functional Genomics at Biological Research Centre, Szeged (Hungary)
[9] Department of Foundations of Computer Science, University of Szeged, Szeged (Hungary)
[10] Division of Imaging and Applied Mathematics, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
[11] University of Maryland, Baltimore County, Baltimore, MD 21250, USA
[12] National Library of Medicine, Bethesda, MD 20894, USA

January 24, 2011
Attila Kertész-Farkas