Introduction to programming for biologists
MA170
5
Biotechnology, 1st year
1
Tim Downing
9 March 2012
This module provides biology students with foundation programming skills in Perl and enables them to perform core bioinformatic tasks. It will also introduce them to the scope for further learning and more advanced applications, and allow them to appreciate that computer-based tools are fundamental to modern biology and medicine.
To equip students with the capacity to perform basic data manipulation and analysis tasks using a high-level computer programming language.
acquire foundation programming skills in Perl and perform core bioinformatic tasks
understand the scope for further learning and more advanced applications
appreciate that computer-based tools are fundamental to modern biology and medicine
The module will be delivered as a series of weekly 2-hour lectures based in computer labs for one semester (12x2=24 hours). A weekly 1-hour tutorial will help students who struggle with the material to maintain progress and will allow keen students to enhance their learning (12x1=12 hours).
Lecture, assessment and class exercise notes will be listed on Blackboard. Student assignments for continuous assessment will be completed individually, requiring 7 hours a week of independent learning (12x7=84 hours).
Module Workload:
Activity                                  Time (hours)
Taught Periods
  Taught Activities                       36
  Homework and Study                      84
  Total                                   120
Formal Examination and Study Periods
  Examinations and Study                  0
Overall Total                             120
The module will be examined through continuous assessment: a series of assignments that apply programming to simple biological problems, each of which will account for 1/12th of the overall module marks.
Assessment questions and assignment submission will be completed on Blackboard or via email.
Overall scheme:
Sessions for each week (lecture and tutorial topics):
1. one-liners; awk; bash vs c-shell; creating scripts; warnings
2. printing; aliases; standard input/output; opening/closing files
3. my variables; scalars; strings; arrays; die; if-else; and/or
4. undefined variables; while; for; unless; until
5. pattern matching; motif searching; printf statements
6. translating DNA/protein sequences; sequence processing
7. homology searching using scripts to implement tools; accessing websites
8. debugging; subroutines; script structure
9. example application: testing rates of molecular evolution in DNA (e.g. PAML)
10. hashes – format, usage and applications
11. example application: processing high-throughput DNA sequence data
12. modular programming; online resources; other languages
Example session assignments:
1. write Perl/awk/shell commands to denote columns in a fasta text file
2. read in a set of genes, sort them based on user preference, and print them to a file
3. denote the amino acid length and presence of stop codons in a DNA sequence
4. do pairwise alignments for all possible pairs of sequences in a file
5. determine the species of origin of an unknown sequence
6. guess the likely function of a gene using homology searches
7. assess variation at the Mc1r gene in vertebrates
8. use subroutines to infer mutation type in a protein sequence
9. determine which primate species has the fastest evolving Foxp2 gene
10. compare hash- and array-based scripts on computer time taken
11. determine the origin of an unknown high-throughput sequence library
12. incorporate a simple BioPerl/CPAN module to execute homology searches
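
To give a flavour of the level involved, example assignment 3 (amino acid length and presence of stop codons in a DNA sequence) could be approached roughly as in the Perl sketch below. This is an illustration only, not the module's model answer. It assumes the standard genetic code, translation in reading frame 1 only, and a made-up input sequence; the names translate_dna and %codon_table are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the standard genetic code as a hash (codon => one-letter amino acid).
# The amino acid string follows the conventional T, C, A, G base ordering;
# '*' marks a stop codon.
my @bases       = qw(T C A G);
my $amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($amino_acids, $i++, 1);
        }
    }
}

# Translate a DNA string in reading frame 1 (hypothetical helper).
# Unrecognised codons (e.g. containing N) become 'X';
# a trailing partial codon is ignored.
sub translate_dna {
    my ($dna) = @_;
    $dna = uc $dna;
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
        my $codon = substr($dna, $pos, 3);
        $protein .= exists $codon_table{$codon} ? $codon_table{$codon} : 'X';
    }
    return $protein;
}

# Made-up example sequence, used only for illustration.
my $dna      = 'ATGGCCTGGTAAGGC';
my $protein  = translate_dna($dna);
my $stops    = () = $protein =~ /\*/g;      # count '*' characters
my $residues = length($protein) - $stops;   # amino acids, excluding stops

print "Protein: $protein\n";
print "Amino acids: $residues\n";
print "Stop codons: $stops\n";
```

For the made-up sequence above, the script reports the protein MAW*G, with 4 amino acids and 1 stop codon. The codon table is stored in a hash because looking up a codon by key is a natural fit for the hashes covered in session 10.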