What is a Project
• Purpose
  – Use a method introduced in the course to describe some biological problem
• How
  – Construct a data set describing the problem
  – Define which method to use
  – (Develop method)
  – Train and evaluate method
  – (Compare performance to other methods)
• Documentation
  – Write report in form of a research article (10-15 pages)
    • Abstract
    • Introduction
    • Materials and methods
    • Results
    • Discussion
    • References

Project list
1. Peptide MHC binding predictions using position specific scoring matrices including pseudo counts and sequence weighting clustering (Hobohm) techniques
2. Peptide MHC binding predictions using artificial neural networks with different sequence encoding schemes
3. Comparative study of PSSM and ANN for peptide MHC binding
   1. Analysis of how data size defines predictive performance for PSSM and ANN
4. NN-align, a neural network-based method for motif recognition and peptide-binding prediction
5. Improved protein template identification using hidden Markov models (HMMER)
6. Implementation of HMM Baum-Welch algorithm
7. Gibbs sampler for MHC class II binding

PSSM
• Peptide MHC binding predictions using position specific scoring matrices including pseudo counts and sequence weighting techniques
  – Compare methods for sequence weighting
    • Clustering vs heuristics
  – Benchmark (Peters et al 2006) covering some 20 MHC molecules, compare to best other methods

NN sequence encoding
• Peptide MHC binding predictions using artificial neural networks with different sequence encoding schemes
  – Benchmark (Peters et al 2006) covering some 20 MHC molecules, compare to best other methods
  – Compare sequence encoding schemes
    • Sparse, Blosum, composition, charge, amino acid size, ...

Comparative study
• Compare methods for MHC peptide binding
  – PSSM
  – ANN
• Does data size matter?
• Data: Benchmark by Peters et al 2006 covering some 20 MHC molecules

Hidden Markov models
• Improved protein template identification using hidden Markov models (HMMER)
  – Train profile HMMs for remote protein fold recognition
    • Use the HMMER program to construct profile HMMs for a selected set of proteins from the CASP8 competition
    • Use the HMMER models to identify PDB templates for homology modeling of CASP8 targets

HMM
• Implement Baum-Welch HMM training
  – Based on code from the Tapas Kanungo HMM toolkit
    • A tar file can be found at:
      – http://www.kanungo.com/software/umdhmm-v1.02.tar.
    • A zip file:
      – http://www.kanungo.com/software/umdhmm-v1.02.zip.
    • README file
      – http://www.kanungo.com/software/umdhmm-v1.02.
    • Tutorial talks
      – http://www.kanungo.com/software/hmmtut.ps
      – http://www.kanungo.com/software/hmmtut.pdf
  – Test code on the unfair casino example

Gibbs sampler
• Gibbs sampler approach to the prediction of MHC class II binding motifs
  – Develop a Gibbs sampler for the prediction of MHC class II binding motifs
  – Benchmark: Nielsen et al 2007, covering 14 HLA-DR alleles
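
As a starting point for the Gibbs sampler project, the idea can be sketched as a basic site sampler: each peptide is assigned a core start position, one peptide at a time is held out, a log-odds matrix is built from the cores of the others, and a new start position for the held-out peptide is drawn in proportion to how well each candidate core fits that matrix. This is a minimal sketch and not the method behind the cited benchmark. The 9-residue core length, the flat pseudocount, the uniform background, and all function names are assumptions made for illustration. Peptides are assumed to be at least 9 residues long and to contain only the 20 standard amino acids.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CORE_LEN = 9                         # assumed binding core length
PSEUDOCOUNT = 1.0                    # assumed flat pseudocount (illustrative)
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency


def log_odds_matrix(cores):
    """Build a CORE_LEN x 20 log-odds matrix from aligned cores."""
    matrix = []
    for pos in range(CORE_LEN):
        counts = {aa: PSEUDOCOUNT for aa in AMINO_ACIDS}
        for core in cores:
            counts[core[pos]] += 1.0
        total = sum(counts.values())
        matrix.append(
            {aa: math.log(counts[aa] / total / BACKGROUND) for aa in AMINO_ACIDS}
        )
    return matrix


def score(core, matrix):
    """Sum the per-position log-odds of one candidate core."""
    return sum(matrix[pos][aa] for pos, aa in enumerate(core))


def gibbs_sample(peptides, n_sweeps=200, seed=0):
    """Return one sampled core start offset per peptide."""
    rng = random.Random(seed)
    # Start from random core positions.
    offsets = [rng.randrange(len(p) - CORE_LEN + 1) for p in peptides]
    for _ in range(n_sweeps):
        for i, pep in enumerate(peptides):
            # Hold out peptide i and build the matrix from all other cores.
            others = [
                p[o:o + CORE_LEN]
                for j, (p, o) in enumerate(zip(peptides, offsets))
                if j != i
            ]
            matrix = log_odds_matrix(others)
            # Draw a new start for peptide i, weighted by its odds score.
            starts = list(range(len(pep) - CORE_LEN + 1))
            weights = [math.exp(score(pep[s:s + CORE_LEN], matrix)) for s in starts]
            offsets[i] = rng.choices(starts, weights=weights)[0]
    return offsets


if __name__ == "__main__":
    # Made-up peptides for illustration only, not benchmark data.
    peptides = ["AAKFVAAWTLKAAAGG", "GGKFVAAWTLKAAA", "PKYVKQNTLKLATSS"]
    for pep, start in zip(peptides, gibbs_sample(peptides, n_sweeps=50)):
        print(pep, start, pep[start:start + CORE_LEN])
```

A project implementation would likely add sequence weighting, a better pseudocount scheme, and an objective such as the information content of the alignment to track convergence. Those choices are left open here.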