Bioinformatics Education at Harvey Mudd

Bioinformatics Education at Harvey Mudd College
Ran Libeskind-Hadas, Department of Computer Science
Thanks to Eliot Bush (Biology) and Zach Dodds (Computer Science)
Our name is Mudd…
• Undergraduate only; 700 students
• Sciences, mathematics, and engineering
The HMC Curriculum
• Core (includes one semester of CS and one of Biology)
• Major
• Humanities
• Electives
Experiments in the Core
Paths through Semester 1 and Semester 2:
• The “regular” path: Introduction to CS, then Introduction to Biology (200 students per year)
• An integrated full-year course: Integrated Introduction to CS and Biology (20 students in 2009-2010)
• A one-semester integrated course: Computation and Biology, then Introduction to Biology … or a second Biology course (40 students in 2010-2011)
– Satisfies CS core requirement but not the Biology requirement
Computation and Biology Core Course
Objectives:
– Cover the content of the “regular” CS intro course
– Demonstrate the relationship between computing and biology
– Use computation to teach biology fundamentals and use biology to motivate computing fundamentals
– Provide students with computational tools to perform their own “dry lab” experiments
Course Structure
Weekly schedule:
• Tuesday: Biologist
• Thursday: CSist
• Friday: CSist
• Weekend: Assignment
Wks 1-3
• Biology topics: DNA, RNA, central dogma, genes: Case study of lactose intolerance
• CS topics: Introduction to Python: Data, functions, and basic constructs
• Subset of student HW: Gene finding, gene expression, lactase expression
Wks 4-5
• Biology topics: Population genetics, molecular evolution
• CS topics: Designing a larger program, randomness, simulation
• Subset of student HW: Mitochondrial Eve, diploid populations with selection, molecular evolution simulations
Wks 6-7
• Biology topics: Sequence alignment
• CS topics: Recursion
• Subset of student HW: Implement alignment and extend to deal with substitutions
Wks 8-9
• Biology topics: Phylogenetics
• CS topics: Recursion on trees and phylogenetic tree algorithms
• Subset of student HW: Implementing a phylogenetic tree algorithm and making inferences from the results
Wks 10-11
• Biology topics: Folding: RNA to Proteins
• CS topics: RNA folding algorithm, efficiency, and memoization
• Subset of student HW: Implement RNA folding and visualize results
Wks 11-12
• Biology topics: Systems biology and modeling: Chemotaxis
• CS topics: Computation and modeling
• Subset of student HW: Chemotaxis simulations and evaluation of models
Wks 13-14
• CS topics: Limitations of computation
• Subset of student HW: Capstone Projects
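To give a flavor of the "recursion on trees" topic in the schedule, here is a minimal sketch, not the course's actual assignment code. It assumes a phylogenetic tree is stored as nested pairs whose leaves are species names; that representation, the function names, and the example species are all hypothetical.

```python
def leaves(tree):
    """Return the leaf labels of a tree, left to right."""
    if isinstance(tree, str):      # base case: a leaf is just a species name
        return [tree]
    left, right = tree             # recursive case: an internal node is a pair
    return leaves(left) + leaves(right)


def height(tree):
    """Return the number of edges on the longest root-to-leaf path."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + max(height(left), height(right))


def to_newick(tree):
    """Render the tree as a Newick-style string (without the trailing ';')."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return "(" + to_newick(left) + "," + to_newick(right) + ")"


# Hypothetical example tree: human and chimp are siblings, gorilla is the outgroup.
example_tree = (("human", "chimp"), "gorilla")
print(leaves(example_tree))
print(height(example_tree))
print(to_newick(example_tree) + ";")
```

Each function has the same shape: a base case for a leaf and a recursive case that combines the answers for the two subtrees.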
Using computation to teach biology fundamentals
Population genetic model
Explore effects of drift and selection, Hardy-Weinberg equilibrium
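A population genetic model of this kind could be sketched as follows. This is an illustrative Wright-Fisher-style simulation under stated assumptions (two alleles, a diploid population of fixed size, relative fitness applied per allele), not the model used in the course; every function and parameter name here is hypothetical.

```python
import random


def next_generation(freq_a, pop_size, fitness_a=1.0, fitness_b=1.0, rng=random):
    """Return the frequency of allele A after one generation."""
    # Selection: reweight each allele by its relative fitness.
    weighted_a = freq_a * fitness_a
    weighted_b = (1.0 - freq_a) * fitness_b
    p_selected = weighted_a / (weighted_a + weighted_b)
    # Drift: draw 2N allele copies at random from the reweighted pool.
    copies = 2 * pop_size
    count_a = sum(1 for _ in range(copies) if rng.random() < p_selected)
    return count_a / copies


def simulate(freq_a, pop_size, generations, fitness_a=1.0, fitness_b=1.0, seed=None):
    """Return the list of allele A frequencies, one entry per generation."""
    rng = random.Random(seed)
    history = [freq_a]
    for _ in range(generations):
        freq_a = next_generation(freq_a, pop_size, fitness_a, fitness_b, rng)
        history.append(freq_a)
    return history


def hardy_weinberg(freq_a):
    """Expected genotype frequencies (AA, Aa, aa) at Hardy-Weinberg equilibrium."""
    freq_b = 1.0 - freq_a
    return freq_a ** 2, 2 * freq_a * freq_b, freq_b ** 2


# Drift only: a small population wanders far more than a large one.
small = simulate(0.5, pop_size=10, generations=50, seed=1)
large = simulate(0.5, pop_size=1000, generations=50, seed=1)
print(small[-1], large[-1])
print(hardy_weinberg(0.5))
```

Rerunning with different `pop_size` values shows drift, and setting `fitness_a` above `fitness_b` adds selection on top of it.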
Using biology to motivate computation: RNA Folding
Recursion and memoization
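The pairing of recursion with memoization can be illustrated by a simplified folding score: the maximum number of non-crossing base pairs in a sequence. This is a sketch in the spirit of the Nussinov algorithm, not necessarily the algorithm taught in the course. It assumes only Watson-Crick pairs (A-U, G-C) count, and the `min_loop` parameter is a hypothetical knob for the minimum number of bases between paired positions.

```python
from functools import lru_cache

# Assumption: only Watson-Crick pairs are allowed (no G-U wobble pairs).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_pairs(rna, min_loop=0):
    """Return the maximum number of non-crossing base pairs in rna."""

    @lru_cache(maxsize=None)       # memoization: each (i, j) is solved once
    def fold(i, j):
        # Best score for the substring rna[i..j], inclusive.
        if j - i < 1:              # fewer than two bases: nothing to pair
            return 0
        # Option 1: base i stays unpaired.
        best = fold(i + 1, j)
        # Option 2: base i pairs with some later base k.
        for k in range(i + 1 + min_loop, j + 1):
            if (rna[i], rna[k]) in PAIRS:
                inside = fold(i + 1, k - 1)
                after = fold(k + 1, j)
                best = max(best, 1 + inside + after)
        return best

    return fold(0, len(rna) - 1)


print(max_pairs("GGGAAACCC"))  # 3: the three G's pair with the three C's
```

Without the `lru_cache` line the same recursion still gives correct answers but repeats subproblems exponentially often, which is the efficiency lesson the topic list points to. Very long sequences would hit Python's recursion limit in this form.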
Above and Beyond…
Final project example: What makes cholera pathogenic?
• Pathogenic vs. non-pathogenic strains
• Compare all genes in one strain with all genes in the other to find orthologs (use fast global alignment)
• Programmatically BLAST unique proteins to see what they are
• Read about these unique genes and explain what they do
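The ortholog-finding step relies on global alignment. Below is a minimal sketch of a Needleman-Wunsch score computed row by row, so memory grows with one sequence length rather than the product of both. The scoring values (match +1, mismatch -1, gap -2) and the `best_match` helper are assumptions for illustration; the slides do not state the students' scoring scheme or ortholog criterion.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of seq_a and seq_b."""
    # Row 0: aligning the empty prefix of seq_a against prefixes of seq_b.
    previous = [j * gap for j in range(len(seq_b) + 1)]
    for i, a in enumerate(seq_a, start=1):
        current = [i * gap]        # column 0: prefix of seq_a against nothing
        for j, b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if a == b else mismatch)
            up = previous[j] + gap         # gap in seq_b
            left = current[j - 1] + gap    # gap in seq_a
            current.append(max(diagonal, up, left))
        previous = current
    return previous[-1]


def best_match(gene, candidates):
    """Hypothetical helper: return the candidate with the highest score."""
    return max(candidates, key=lambda other: global_alignment_score(gene, other))


print(global_alignment_score("ACGT", "AGT"))  # 1: three matches and one gap
print(best_match("ACGT", ["TTTT", "ACGA", "ACGT"]))
```

A gene whose best match in the other strain scores poorly would be a candidate "unique" gene for the BLAST step; where to draw that line is a design decision left open here.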
Courtesy of Prof. Russell Schwartz


Some genes encode transcription factors that promote or inhibit the expression of other genes
Microarray data…
• Purple is highly expressed, green is not expressed
[Figure: microarray heat map with axes "genes" and "conditions"]
Courtesy of Prof. Russell Schwartz
Intuition Behind Network Inference
         conditions
gene 1   0 1 0 1 1
gene 2   0 1 0 1 1
gene 3   1 0 1 0 0
gene 4   1 1 0 0 1
[Figure: candidate regulatory networks over genes 1 to 4, with edges labeled + or -]
Correlated expression implies common regulation
That intuition still leaves a lot of ambiguity
Courtesy of Prof. Russell Schwartz
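The intuition can be made concrete with a small sketch that scores how often two genes are on or off together. The four profiles are the slide's values, read on the assumption that the extracted numbers run condition by condition; the function names and the 0.8 threshold are hypothetical.

```python
from itertools import combinations

# Assumed reading of the slide's matrix: one list of 0/1 values per gene,
# one entry per condition.
expression = {
    "gene 1": [0, 1, 0, 1, 1],
    "gene 2": [0, 1, 0, 1, 1],
    "gene 3": [1, 0, 1, 0, 0],
    "gene 4": [1, 1, 0, 0, 1],
}


def agreement(profile_a, profile_b):
    """Fraction of conditions in which two genes have the same state."""
    same = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return same / len(profile_a)


def candidate_links(data, threshold=0.8):
    """Return (gene, gene, sign) for strongly correlated or anti-correlated pairs."""
    links = []
    for g1, g2 in combinations(sorted(data), 2):
        score = agreement(data[g1], data[g2])
        if score >= threshold:
            links.append((g1, g2, "+"))    # usually on together
        elif 1 - score >= threshold:
            links.append((g1, g2, "-"))    # usually in opposite states
    return links


for link in candidate_links(expression):
    print(link)
```

The output pairs are unordered: the score cannot say whether gene 1 regulates gene 2, gene 2 regulates gene 1, or a third gene regulates both. That is the ambiguity the slide refers to.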
Assuming a Binary Input Matrix
• We will assume that genes only have two possible states: 0 (off) or 1 (on)
[Figure: binary expression matrix, gene 1 to gene 4 by conditions]
• We will also assume that we want to find directionality but not strength of regulatory interactions
• We will exclude the possibility of regulatory cycles:
[Figure: two example networks on genes 1 to 4, an acyclic one labeled OK and one containing a cycle labeled NOT OK]
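Excluding regulatory cycles means each candidate network must be checked for acyclicity. One way to do that, sketched here with recursion, is a depth-first search that reports a cycle when it reaches a gene still being explored. The edge lists below are hypothetical stand-ins for the OK and NOT OK diagrams, whose actual edges are not recoverable from the slide text.

```python
def is_acyclic(edges, genes):
    """Return True if the directed network has no regulatory cycle.

    edges is a list of (regulator, target) pairs; genes lists every node.
    """
    targets = {gene: [] for gene in genes}
    for regulator, target in edges:
        targets[regulator].append(target)

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {gene: UNSEEN for gene in genes}

    def visit(gene):
        state[gene] = IN_PROGRESS
        for nxt in targets[gene]:
            if state[nxt] == IN_PROGRESS:
                return False       # reached a gene on the current path: cycle
            if state[nxt] == UNSEEN and not visit(nxt):
                return False
        state[gene] = DONE
        return True

    # Start a search from every gene not already reached.
    return all(state[gene] != UNSEEN or visit(gene) for gene in genes)


genes = [1, 2, 3, 4]
ok_network = [(1, 2), (2, 4), (1, 3)]               # hypothetical, no cycle
not_ok_network = [(1, 2), (2, 4), (4, 3), (3, 1)]   # hypothetical, 1->2->4->3->1
print(is_acyclic(ok_network, genes))      # True
print(is_acyclic(not_ok_network, genes))  # False
```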
The Project
• Take binary microarray data as input
• Find the acyclic regulatory network with the highest likelihood
• Display the network somehow
Student Response
Likert scale (1 low, 7 high) survey:
“This course stimulated my interest in the subject matter”
• College mean: 5.53/7.0 (std. dev. 0.80)
• Computation and Biology: 6.51/7.0
“I learned a great deal in this course”
• College mean: 5.76/7.0 (std. dev. 0.72)
• Computation and Biology: 6.49/7.0
“Time spent outside of class (per week)”
• College mean: 4.98 hours (std. dev. 2.42)
• Computation and Biology: 6.28 hours
What did students choose to do the following term?
Students have one elective in the spring term
• Took introductory biology: 0/40
• Took an elective other than CS or biology: 0/40
• Took an “upper division” biology course: 18/40
• Took the second CS course: 22/40
Outperformed their peers
• Students learned the foundational content of “Intro CS” and “Intro Biology”
• Students’ programs provide rich “dry lab” experiments and simulations that reinforce understanding of biology
• Students develop general problem-solving and programming skills (e.g. dynamic programming) and have confidence to solve “new” problems on their own
Next steps…
• Increasing student demand for more courses and even a major in computational biology
• “Mathematical Biology Major” redesigned in Spring 2011 to “Mathematical and Computational Biology (MCB)” major
– Good news: 9 MCB majors in sophomore year (6 Biology majors and 2 Biochemistry majors)
– Bad news: Few faculty in a position to contribute
Beyond the core (intro CS, intro Biology, 3 semesters math, 2 chemistry, 1 physics, …)
Introductory Sequence
• Discrete Math
• Biology laboratory
• Introduction to Mathematical and Computational Biology
Biology Foundations
• Three of: Comparative physiology, ecology and environmental biology, evolutionary biology, molecular biology
• One biology seminar
• One biology laboratory
Mathematical and Computational Courses
• Intermediate Mathematical Biology
• Computational Biology
• One upper-division math course
• One upper-division CS course
• Three more math and CS courses
Electives, Thesis, Colloquium
• One related elective
• Colloquium
• Senior thesis
Future Plans…
• Refine and improve introductory course
• Write a book for the introductory course
• Collaborate with “sister” institutions to expand computational biology curriculum
– New faculty
– New courses
Questions, Comments, Heckles