Introduction to programming for biologists

Module Title: Introduction to programming for biologists

Module Code: MA170

Credits: 5

Courses and year of study for which module is designed: Biotechnology, 1st year

Semester(s) in which module is taught: 1

Module Coordinator: Tim Downing

Date submitted to LTA Committee: 9 March 2012

Module Descriptor

Module Overview:

This module provides biology students with foundation programming skills in Perl and enables them to perform core bioinformatic tasks. It will also introduce them to the scope for further learning and more advanced applications, and allow them to appreciate that computer-based tools are fundamental to modern biology and medicine.

General Aims:

To equip students with the capacity to perform basic data manipulation and analysis tasks using a high-level computer programming language.

Learning Outcomes:

On successful completion of this module, students should be able to:

• apply foundation programming skills in Perl to perform core bioinformatic tasks

• understand the scope for further learning and more advanced applications

• appreciate that computer-based tools are fundamental to modern biology and medicine

Module Delivery:

The module will be delivered as a series of weekly 2-hour lectures based in computer labs for 1 semester (12x2=24 hours). A weekly 1-hour tutorial will help students who struggle with the material to maintain progress, and will allow keen students to enhance their learning (12x1=12 hours).

Lecture, assessment and class exercise notes will be listed on Blackboard. Student assignments for continuous assessment will be completed individually, requiring 7 hours a week of independent learning (12x7=84 hours).

Module Workload:

Activity Time (hours)

Taught Periods:
  Taught Activities: 36
  Homework and Study: 84
  Total: 120

Formal Examination and Study Periods:
  Examinations and Study: 0

Overall Total: 120

Module Assessment:

The module will be examined through continuous assessment, in the form of weekly assignments that apply programming to simple biological problems. Each assignment will account for 1/12th of the overall module marks.

Assessment questions and assignment submission will be completed on Blackboard or via email.

Overall scheme:

Sessions for each week (lecture and tutorial topics):

1. one-liners; awk; bash vs c-shell; creating scripts; warnings
2. printing; aliases; standard input/output; opening/closing files
3. my variables; scalars; strings; arrays; die; if-else; and/or
4. undefined variables; while; for; unless; until
5. pattern matching; motif searching; printf statements
6. translating DNA/protein sequences; sequence processing
7. homology searching using scripts to implement tools; accessing websites
8. debugging; sub-routines; script structure
9. example application: testing rates of molecular evolution in DNA (e.g. PAML)
10. hashes: format, usage and applications
11. example application: processing high-throughput DNA sequence data
12. modular programming; online resources; other languages

Example session assignments:

1. write Perl/awk/shell commands to denote columns in a fasta text file
2. read in, sort based on user preference, and print to file a set of genes
3. denote the amino acid length and presence of stop codons in a DNA sequence
4. do pairwise alignments for all possible sequences in a file
5. determine species given unknown sequence
6. guess likely function of gene using homology searches
7. assess variation at the Mc1r gene in vertebrates
8. use sub-routines to infer mutation type in a protein sequence
9. determine which primate species has the fastest-evolving Foxp2 gene
10. compare hash- and array-based scripts on computer time taken
11. determine origin of unknown high-throughput sequence library
12. incorporate simple BioPerl/CPAN module to execute homology searches
