Curriculum Vitae --- Xianfeng Jeff Chen, Ph.D.
Executive Summary:
(1) I am a bioinformatician and computational/systems biologist working on human, plant, and microbial systems, with
research interests and over 10 years of professional experience in bioinformatics, computational biology, genomics,
gene expression profiling, and proteomics, with emphasis on genome annotation, network analysis, functional and
comparative genomics, biological databases and data integration, algorithm development, software engineering,
high-throughput distributed parallel cluster computing, automation of data processing, and biological data mining
and knowledge discovery.
(2) I worked at both biotech and pharmaceutical companies for about 8-9 years as a bioinformatician before
returning to an academic career for about 4 years. I have over 10 years of genomics experience in genome-scale
genomic, gene-space, and EST sequencing and in microarray transcription profiling. I have assembled human, mouse,
and over 5 plant genomes at genome scale, as well as about 170 microbial genomes. I also have over 6 years of
professional proteomics experience (mass spectrometry, yeast two-hybrid, and protein array) on human and Category A
bio-defense pathogens, and have assembled genome-scale human protein-protein interaction networks for mining
druggable colon cancer target proteins, target validation, assay development, interactive small-molecule chemical
screening, and drug discovery.
Citizenship: The United States of America.
Address: 1726 Webland Park, Charlottesville, VA 22901.
Email: xianfengchen05@gmail.com
Phone: 434-974-7099.
Education:
Ph.D., Major: Genetics, 1996
Iowa State University, U.S.A.
B.S./M.Sc. equivalent, Major: Computer Science, 1998
Iowa State University.
GPA: 3.98/4.00; completed 18 computer courses, including all undergraduate
plus 4 graduate computer courses. Major training focused on data
warehousing and software engineering.
Honors and Awards:
Honored as top 2% of Iowa State computer science students, 1998
C. R. Weber Award for Excellence of Graduate Studies, Iowa State University, 1996
Areas of Expertise:
(1) Bioinformatics and computational proteomics, network analysis, data mining and knowledge discovery.
(2) Biological database/data warehouse design, implementation, and management.
(3) Algorithms in computational biology, biological sequence analysis and processing.
(4) Bioinformatics on second generation sequencing technology and disease genetics association study.
(5) Programming language, software engineering, and high throughput distributed parallel cluster computing.
(6) Chemical compound management, library similarity and diversity analysis, and compound clustering.
(7) Genetics and biochemistry, comparative and functional genomics, and systems biology.
Skills in Computational Biology:
(1) Dynamic programming, greedy algorithms, and divide-and-conquer strategy.
(2) Needleman-Wunsch algorithm.
(3) Smith-Waterman algorithm.
(4) Pair-wise sequence alignment.
(5) Multiple/progressive pair-wise alignment.
(6) Memory-based reasoning.
(7) Neural network and belief network classifiers.
(8) Decision tree classifier.
(9) Consensus and regular expression pattern matching.
(10) Position weight matrix for motif detection.
(11) Profile or template.
(12) Bayesian network.
(13) Hidden Markov Model.
(14) Phylogenetic tree classification.
(15) Non-homology-based annotation.
(16) Expert at computer farm systems such as Load Sharing Facility (LSF), Portable Batch System (PBS), and the
DeCypher system.
(17) Expert at IDBS’s ActivityBase chemo-informatics software for interactive small-molecule screening.
(18) Expert at proteomics software systems such as Genologics Proteus, ISB TPP, Thermo BioWorks and SIEVE
for protein identification and quantitative profiling, Scaffold (Proteome Software), Scripps proteomics
software systems, etc.
Experience in Bioinformatics and Computational Biology:
(1) Automation of high-throughput genomic sequence analysis and processing using Perl, shell scripts, C++,
Java, CGI, GNU make, and various other information technologies.
(2) Design and implementation of biological databases using PostgreSQL, MySQL, Oracle 10i, Sybase 11, and
Illustra DBMS. Experience in data modeling using ERwin, and in database programming using Pro*C/C++,
PL/SQL, JDBC, DBI/DBD, Oracle scripts, etc. in Oracle DBMS.
(3) Expert knowledge of biological databases, especially Transfac, TRRD, TFD, COMPEL, EPD, UTR,
PLACE, PlantCARE, BIND, DIP, etc.
(4) Developed graphical linkage analysis software using Visual C++ as a graduate student and postdoctoral
researcher.
(5) Hands-on experience operating software for gene prediction, EST clustering and assembly,
genome annotation, and promoter prediction and annotation, such as Genscan, MZEF, Genefinder, Grail,
NetPlantGene, Pangea CAT, CAP, Phrap, PromFinder, Promoter Scan, Signal Scan, MatInspector,
Pattern search, SplicePredictor, GeneMark.hmm, Aragen, etc.
(6) Experience in incorporating proteomics software such as X!Tandem, OMSSA, Sequest, Mascot, TPP,
Scaffold, Qscore, ProteinProphet, PeptideProphet, AMASS, Rscore, Bioworks, and SIEVE, and genome analysis
tools such as BLAST, FASTA, RepeatMasker, cross_match, EMBOSS, and HMMER, into a biological data
warehouse.
(7) Experience in building databases of promoter sequences, transcription factors, and their
binding sites, and in developing promoter prediction software using regulatory sequence
databases with machine learning and pattern recognition algorithms.
(8) Experience in building a company intranet using HTML, JavaScript, and CGI.
(9) Experience in comparative genomics among Arabidopsis, maize, soybean, tobacco, cowpea, and rice.
(10) Design and implementation of a plant metabolic and developmental database.
(11) Experience in inference of protein-protein interaction networks from yeast two-hybrid system data,
mass spectrometry based pull-down assays, and protein quantification and profiling.
Major Academic Bioinformatics Databases Developed:
(1) http://www.proteomicsresource.org, National Biodefense Proteomics Data Center.
(2) http://geossdev.med.virginia.edu/~xc3m/, Microarray Coexpression Explorer for Cancer Chemotherapy.
(3) http://cowpeagenomics.med.virginia.edu/, Cowpea Genomics Knowledge Base.
(4) http://compsysbio.achs.virginia.edu/tobfac/, Tobacco Transcription Factor Database.
(5) http://xi00.achs.virginia.edu/~xc3m/, UVa Systems Biology Knowledge Warehouse.
Professional Experience:
Bioinformatics Consultant
IFXworks, LLC. (http://www.ifxworks.com), Dulles, VA.
2007-2009
Director of Informatics/Adjunct Professor of Bioinformatics
Division of Systems Biology, Zhejiang-California International Nanoscience Institute (ZCNI),
Zhejiang University, Hangzhou, P.R. of China
2006-2009
Description of organizations and positions: (1) IFXworks LLC is a life science informatics consulting company
headquartered in the Washington DC area. I am one of the founding members and provide bioinformatics consulting
services in health IT, next generation sequencing technologies, bioenergy genomics, and network and systems biology.
(2) ZCNI is a joint venture of the Institute of Systems Biology, California Nanoscience Institute at UCLA, and
Zhejiang University. Zhejiang University is located in Hangzhou (my Chinese home town) and is one of the
distinguished top 3 universities in China. My appointment has been a courtesy appointment as adjunct faculty to
provide consultation and guidance in establishing computational cyber-infrastructure for systems biology research
in the institute.
Duty/projects: (1) Building basic infrastructure for IFXworks for next generation sequencing data management,
genomic sequence assembly and analysis, human disease and trait genotype-to-phenotype association studies, and data
analysis and processing for plant and microbial bio-energy genomics. (2) Building informatics prototypes for
genomics, transcriptomics, and proteomics data analysis, high-throughput processing, and management. (3) Developing
grant proposals and contract applications to caBIG, health IT technology, and next generation sequence analysis for
personalized medicine and data management.
Computational and Systems Biologist
Virginia Bioinformatics Institute (VBI) and University of Virginia (UVa), VA
2005-2009
Description of organizations and positions: (1) I was a research investigator at VBI, a systems biology
organization with a strong presence in bioinformatics, performing tasks related to network biology, genome
annotation, transcription profiling, proteomics, and metabolic profiling data management and analysis. (2) I also
worked at the Center for Academic Computing Health Sciences as well as the W.M. Keck Foundation Center for
Biomedical Mass Spectrometry (a UVa research support facility), conducting collaborative bioinformatics research
with faculty working in systems biology. (3) I was jointly appointed as a contract-based research assistant
professor affiliated with the Department of Microbiology and was also affiliated with the Dept. of Biology as a
research scientist. The position was dependent on funded grants and cost-recovery service fees from collaborating
faculty.
Duty/projects: At VBI, I was the project manager for the Administrative Center funded through the NIH/NIAID National
Bio-defense Proteomics Program, which has 7 Proteomics Research Centers across the nation, including Scripps, Harvard
Proteomics Institute, University of Michigan, PNNL, and Myriad Genetics. My tasks were: (1) design and implementation
of bioinformatics cyber-infrastructure for a genomics, microarray, and proteomics data processing and management
system; (2) data integration of various public and private proteomics and protein-protein interaction network
knowledge datasets; (3) data analysis and network inference on proteomics datasets such as 2D gel, mass spectrometry,
Y2H, and NMR. At UVa, I was a research faculty member collaborating with medical researchers and plant
scientists on: (1) proteomics profiling studies, data management and analysis, search engine comparison,
algorithm development, and high-throughput distributed computing for collaborative research with UVa Health System
proteomics scientists; (2) cowpea, common bean, Striga, and tobacco genome-scale genespace sequencing, assembly
and annotation, data integration and management; (3) cowpea and tobacco microarray chip design and transcriptional
profiling studies, data analysis and management; (4) comparative genome analysis among Populus, Medicago,
Arabidopsis, rice, tobacco, and cowpea for species-specific gene and pathway discovery; (5) genome-scale
bioinformatics analysis of transcription factors in legume species and construction of a transcription factor
knowledgebase.
Project Manager of Computational Proteomics/Senior Scientist of Bioinformatics and Chemo-informatics
Myriad Proteomics, Inc. / Prolexys Pharmaceuticals, Inc. Salt Lake City, UT
2002 - 2005
Description of the organization: Prolexys Pharmaceuticals, formerly Myriad Proteomics, is a human proteomics and
drug discovery company. The company is a leader in genome-scale mapping of the human protein-protein interaction
network and was a subsidiary of Myriad Genetics, Inc.
Duty/projects: (1) Construction of an automated sequence processing pipeline including raw trace assembly,
vector/adaptor clipping, contamination screening, annotation of raw reads, clustering of ESTs from interaction
libraries, domain/motif identification, and mapping of the assembled contigs to the human genome, RefSeq, and
LocusLink; (2) design and implementation of analysis pipelines, databases, and visualization tools for raw sequence
data, literature data, curation data, and in-house human yeast two-hybrid (Y2H) and mass spectrometry
protein-protein interaction and quantification data; (3) inference of protein-protein interaction networks from Y2H
system data, protein fingerprints, mass spectrometry pull-down assays, and public knowledgebases such as BIND, DIP,
and YPD; (4) data management for assay data from high-throughput small-molecule screening (HTS) for drug discovery
on validated targets, using IDBS’s ActivityBase for protocol, template, and testset construction. I am a domain
expert on ActivityBase, a major database management system for drug discovery and HTS; (5) chemical compound
management for HTS, chemical compound library diversity and similarity analysis, compound/library selection, and
compound clustering and partitioning.
Project Manager of Computational Genomics /Senior Scientist of Computational Biology
Ceres Inc. in Los Angeles, CA and Monsanto Global Headquarters in St. Louis, Missouri
1997 - 2002
Description of the organizations: Ceres, a partially Monsanto-owned company with over 250 employees, is an
agricultural functional genomics company. Ceres has one of the best agricultural genomics programs in the world.
Monsanto is a large multinational agricultural company with business in agricultural productivity, seeds, and
genomics.
Duty/projects: (1) Construction of a metabolic pathway and protein-protein interaction database for yeast, crop, and
microbial sequence data; (2) transcriptional profiling and genomics data mining and knowledge discovery for gene
lead identification to support development of agriculturally important traits; (3) software engineering of promoter
prediction and annotation, as well as construction of a regulatory sequence database; (4) maintenance and enhancement
of the genomic sequence processing and annotation pipeline, including raw trace assembly, gene modeling, promoter
prediction, comparative genomics, and protein family classification; (5) maintenance and enhancement of
high-throughput genome analysis for full-length cDNA sequences and of the EST clustering and assembly pipeline for
the top 10 business-critical species: human, mouse, dog, pig, soybean, maize, rice, wheat, cotton, and Arabidopsis;
(6) comparative genomics between human and mouse, as well as among Arabidopsis, rice, maize, wheat, and other crop
genomes; (7) processing over 170 microbial genomes for gene prediction and inter-species clustering for comparative
genomics through the Incyte Pathoseq bio-analysis dataflow, and development of non-homology-based annotation
methodologies such as phylogenetic profiling, co-expression profiling, Rosetta stone patterns, and gene
neighboring; (8) member of the Genomics Source Team for strategic planning of Monsanto agricultural genomics and
crop biotechnology, and member of the company-wide Research Hardware Redesign Team for Computer Farm and Networked
File Server Optimization; (9) I was a leading computational scientist over these years and managed groups of 3-20
people in project teams at Monsanto, including Ph.D.-level scientists and senior software engineers, for the
curation team, microbial genomics team, and sequence processing pipeline team. The size of my group changed with the
complexity and priority of the projects and business operations.
Selected Latest Publications and Patents (Out of over 30 publications and over 20 patents):
1. Michael P. Timko, Paul J. Rushton, Thomas W. Laudeman, Marta T. Bokowiec, Edmond Chipumuro, Foo Cheung,
Christopher D. Town, and Xianfeng Chen, 2008. Sequencing and Analysis of the Gene-Rich Space of Cowpea.
BMC Genomics. 2008, 9:103.
2. Paul J. Rushton, Marta T. Bokowiec, Thomas W. Laudeman, Jennifer F. Brannock, Xianfeng Chen, and Michael P.
Timko, 2008. TOBFAC: The Database of Tobacco Transcription Factors. BMC Bioinformatics. 2008, 9:53.
3. Paul J. Rushton, Marta T. Bokowiec, Shengcheng Han, Hongbo Zhang, Jennifer F. Brannock, Thomas W.
Laudeman, Xianfeng Chen, and Michael P. Timko, 2008. Bioinformatics Analysis on Tobacco Transcription Factors:
Novel Insights into Transcriptional Regulation in the Solanaceae. Plant Physiology. 2008, 147:280-295.
4. Xianfeng Chen et al. (co-authored with about 10 co-inventors), 2008. Expression of microbial proteins in plants for
production of plants with improved properties. United States Patent in Bioinformatics and Biotechnology.
United States Patent Publication Number: 10369493 (US 2003/0233675 A1).
United States Filing Date: 20.02.2003. United States Publication Date and Number: 1.12.2003 (20030233675).
Granted United States Patent Number 7,314,974. Officially Granted Date: 02.05.2008.
5. Xianfeng Chen et al. (co-authored with about 20 inventors), 2008. Transgenic Plants with Enhanced Agronomic
Traits. International Patent in Bioinformatics and Biotechnology.
Publication Number: WO/2008/021543. International Application Number: PCT/US2007/018368.
Publication Date: 21.02.2008. International Filing Date: 17.08.2007.
6. Xianfeng Chen, Thomas W. Laudeman, Paul J. Rushton, Thomas A. Spraggins, and Michael P. Timko, 2007.
CGKB: An Annotation Knowledge Base for Cowpea (Vigna unguiculata L.) Methylation Filtered Genomic Genespace
Sequences. BMC Bioinformatics. 2007, 8:129.
7. Guoqing Lu, Liying Jiang, Resa M. Kotalik, Thaine W. Rowley, Luwen Zhang, Xianfeng Chen, and Etsuko N.
Moriyama, 2006. GenomeBlast: A Web Tool for Small Genome Comparison.
BMC Bioinformatics. 2006, 7(Suppl 4):S18.