Bioinformatics Topics

Commonly Taught Bioinformatics Topics
Derived from Syllabi of 20 Universities
Group 1*: DNA analysis
Sequence Analysis**
Computational Genomics
Protein Structure and Function
Phylogenetics
Microarray Analysis
Group 2: Computing
Programming
Algorithms
Databases
Group 3: Other
General Biology
Biomedical Informatics
Statistics
Law and Ethics
*Group 1 is the most common, followed by Group 2 and then Group 3.
**Within each group, the most common topics are listed first.
Topics in Sequence Analysis
Background
-General principles of DNA/RNA structure and stability
The Basics of Sequence Alignment
-Pair-wise sequence comparison
-multiple sequence alignment
-fragment assembly
-Sequence profiles and profile alignment methods
-Scoring matrices, BLOSUM
-Sequence weighting
Software
-BLAST, FASTA, CLUSTAL, GRAIL, INSIGHT II, RASMOL, HMMER
Algorithms and Models
-Hidden Markov Models
-Maximum-likelihood estimate
-Markov Chains
-Gibbs Sampling
-Dirichlet Mixtures
Phylogenetic inference (see topics in Phylogenetics)
-reconstructing evolutionary relationships
Structure prediction (see topics in protein structure and modeling)
-RNA secondary structure prediction
-protein structure prediction and comparison
Public Access Databases
-sequence and structure search tools
Whole Genome Sequencing (see topics in computational genomics)
-shotgun approaches
-EST assembly
-genome annotation
Topics in Computational Genomics
Whole Genome Reconstruction
-genome mapping
-genome assembly
Comparative Genomics
-General probabilistic graphical models
-Bayesian Networks
-Exact inference
-Learning BNs from data
-EM and structural EM
Functional Genomics
-QTLs & eQTLs
-Non-coding RNA genes
-RNA recognition
-reverse genetics
-The ENCODE project
Gene Expression Analysis
-sequencing methods (including microarrays)
-clustering
-EST libraries
-motif discovery
-DNA methylation & epigenetic gene regulation
Genome Annotation
-gene finding
-gene indices
Genetic Regulation
-gene regulatory regions
-translational regulation: siRNA and microRNA
-identifying miRNAs and their targets
Genomic Technologies
-PCR technology
-GeneChips
Genome Diversity
Genome Structure
The Human Genome Project
Genome Databases
Data Mining Models
-Hidden Markov Models
-Probabilistic formulation
-Mixture models
-Gaussian Mixtures
-Biclustering
-Loss functions
-Conditional maximum likelihood
-Linear regression, GLMs, perceptrons, Neural nets
-SVMs
Proteomics and Metabolomics
Medical Applications
-Diseases and phenotypes
-Pharmaceutical discovery
-Genomic medicine
-Copy number variation
Phylogenetic Inference
-Parsimony
-Stationary Markov processes
-Rate matrices
-Maximum Likelihood
-Maximum a posteriori
-Felsenstein's post-order traversal
-PhyloHMMs
Topics in Protein Structure and Modeling
Physico-chemical properties of proteins
-Protein folding dynamics
Determining Protein Structure
Experimental Techniques
-X-ray Crystallography
-NMR
-cryo-EM
-mass spectrometry
Computational Techniques
-SAM-Txx prediction protocol
-Lattice-based prediction
-Undertaker protein-folding algorithm
Classification of Protein Structure
-Protein families
-Protein domains and prediction of domain boundaries
-Homology modeling
-Comparative modeling of protein structure and threading
-Significance of structure-structure similarity
-Expression data analysis (clustering and classification)
-Structure-structure alignment algorithms
Protein Function
-Protein structure-function relationships
-Prediction of functionally important sites
Public protein structure databases
-Structure database search tools
Protein interactions
-Protein-protein interaction networks
-Voronoi diagram, Delaunay triangulation
RNA secondary structure prediction
Medical Applications
-Protein microarrays and detection of autoimmune disease
-High throughput proteomic disease markers
-Computational methods for protein microarrays
-Remote homology detection
-Proteomic diagnosis of trauma
Topics in Phylogenetics
Molecular basis of evolution
History of Phylogenetic Inference
Characters: Homology, Morphology, Molecular
Phylogenetic Tree Construction
-Alignment Strategies
-Optimality Criteria – Parsimony, ML, ME
-Algorithmic Approaches
-Searching Tree Space
-Character Weighting in Parsimony
-Clustering methods
-Hypothesis Testing: Paired Sites, Parametric Bootstraps
-Multiple Data Sets/Partitioned Models
-Molecular Clocks
-Ancestral Character State Reconstruction
Models of Sequence Evolution
-Model Selection
-Method Performance
Support for Constructed Trees
-Consensus Trees
-G1, PTP, Decay, Bootstrap
-Jackknife & Bayesian Nodal Probabilities
Non-tree Based Methods
Software tools for phylogenetic analysis
Genome comparisons
Protein structure evolution
Topics in Microarray Analysis
Review of the basic biology of gene expression
Overview of microarray technology
Microarray Data Analysis – Statistical Techniques
-regression
-discriminant analysis
-clustering
-classification
-simple graphical models
Methods for computational and biological validation
Topics in Programming
General Programming Concepts (in alphabetical order)
-algorithm design
-arrays
-complex data structures
-control structures
-data types
-debugging
-designing modules
-dynamic programming
-file input/output
-functions
-graphics programming
-hashes
-introduction to machine learning
-multiprocessing & multithreaded programming
-network programming with sockets
-object oriented programming
-pointers
-recursion
-regular expressions
-sorting
-subroutines
-web programming (HTML, CGI)
Languages Taught and Language-Specific Topics
HTML
SQL
Perl
BioPerl
-Genomic resources
-Accessing Remote databases
C
-Flow Control
-C Structures
-Interface to UNIX
-Using a C Application Programming Interface (API)
Java
-Using Java Classes
-GUI Layout
-Java Events
-Java Exception Handling
UNIX
-Using UNIX for basic data processing
-UNIX command-line tools
-UNIX shell programming
-Using UNIX development tools
Intro to relational databases
Bioinformatics Applications
-DNA sequence analysis
-parsing FASTA and GenBank files
-processing BLAST output files
Topics in Algorithms
Concepts of Optimization
-Continuous vs Discrete Optimization
-Constrained and Unconstrained Optimization
-Global and Local Optimization
-Stochastic and Deterministic Optimization
Optimization Algorithms
-Linear and Nonlinear programming
-Combinatorial optimization
-Heuristic search methods
Exact string matching problems
-Suffix trees
-Suffix tree algorithms
Applications
-Prediction of genetic regulatory network
-Protein structure prediction
-Design of microarray experiment, analysis of microarray data
-Biological signal finding
-Neural Networks
Greedy algorithms
Algorithm complexity
Sorting
Recursion
Dynamic programming and space management
Parallel and grid computing
Simulation
Introduction to Machine Learning
Feature Spaces
Hidden Markov Models
SVMs
Topics in Databases
Relational data models and database management systems
-Relational database design
-Foreign Keys
-Relational Integrity
-Entity-Relationship modeling
-Normalization
-Transactions
SQL
-simple queries
-calculated fields
-sorting and grouping results
-aggregate functions
-multi-table queries (inner join, outer join)
-subqueries
-combining result tables
-create and alter tables
ORACLE
Biological Databases
Object-Oriented Databases
Web based programming tools to make databases accessible
Data integration and security
Topics in General Biology
Molecular Biology
-Synthesis, structure, and function of DNA, RNA, and proteins
-Regulation and control of the synthesis of RNA and proteins
-Introduction to molecular biology of eukaryotes
-molecular biological techniques (genetics, recombinant DNA techniques)
-cell structure and cell cycle
Genetics
-relationships among genes
-regulation of gene expression
-use of genetic systems to probe genetic problems
-Mendelian genetics
-genomics
-rules of inheritance in eukaryotic organisms
-DNA replication
-molecular approaches to analyze DNA
-DNA structure
-location of DNA within the cell
-movement of genes within a chromosome
-genetic maps
-chromosome abnormalities
-mutations
-prokaryotic genetics
-genetic recombination
-DNA movement in the genome
-protein synthesis
Topics in Biomedical Informatics
Basics of Biomedical Informatics
-Overview of Discipline and Its History
-Biomedical Computing
-Electronic Medical Records (EMR)
-Decision Support and Health Care Quality
-Standards, Privacy and Security, Costs and Implementation
-Evidence-Based Medicine and Medical Decision-Making
-Imaging Informatics and Telemedicine
-Bibliographic Retrieval
-Networking
-Web-based Interactions
Information Retrieval
-Text Based
-Image Based
-Genomics
-Terms, Models, and Resources
-Health and Biomedical Information
-Evaluation of Systems
-Content
-Indexing
-Retrieval
-Evaluation
-Lexical-Statistical Systems
-Augmenting Systems for the User
Health sciences informatics
-Health sciences information centers
-Health information professionals and roles
-Information resources
-Information organization and access
Topics in Statistics
Statistical Foundations
-random variables
-probability
-statistical inference
-confidence intervals
-hypothesis testing
-correlation and regression
Advanced Statistical Concepts
-sample size and power considerations
-analysis of variance and multiple comparisons
-multiple regression and statistical control of confounding
-logistic regression
-survival analysis
-multiple testing issues and step-down procedures
-length model versus stop character for finite strings
-use of log-probability for computations
-constructing a model from data
-training, cross-training, and testing
-Z-scores (Gaussian dist.) and fat tails of extreme-value (Gumbel dist.)
-machine learning
-supervised learning
-dimensionality reduction
-clustering
-decision trees
-maximum entropy
-Bayes’ Rule and its applications
Algorithms, Models, and Processes
-Stochastic Processes (Poisson, Markov, Random Walks)
-Maximum Likelihood, Likelihood Ratios, and Sequential Analysis
-Gibbs Sampler
-Bootstrap Estimation
Biological Applications
-pairwise and multiple sequence alignment
-gene and protein classification
-phylogenetic tree construction
-high dimension functional genomics data
-gene and motif finding
Programming in R, Perl
Topics in Law and Ethics
-property rights
-privacy and discrimination
-the federal regulatory role
-self-regulatory safeguards
-liability implications for individual/organizational behavior
-policy responses to societal concerns in the U.S. and abroad
Case studies
-gene therapy
-cloning
-biomaterials in the medical and health sector
-farming and crop modification in the agricultural sector