Introduction to Python
BCHB524 2013
Lecture 4

9/9/2013 BCHB524 - 2013 - Edwards

Outline

- Review
- Homework #1 Notes
- Control flow: if statement
- Control flow: for statement
- Exercises

Review

- Printing and execution
- Variables and basic data-types:
  - integers, floats, strings
  - Arithmetic with, conversion between
  - String characters and chunks, string methods
- Functions, using/calling and defining:
  - Use in any expression
  - Parameters as input, return for output
  - Functions calling other functions (oh my!)
- If statements: conditional execution

Homework #1 Notes

- Python programs:
  - Upload .py files
  - Don't paste into comment box
  - Don't paste into your writeup
- Writeup:
  - Upload, don't paste into comment box
  - Text document preferred
- Don't submit Rosalind solutions
  - Rosalind grades recorded separately.

Homework #1 Notes

- Multiple submissions:
  - OK, but…
  - …I'll ignore all except the last one
  - Make each (re-)submission complete
- Grading:
  - Random grading order
  - Comments
  - Grading "curve"

Homework #1 Notes

- Exercise 1:
  - Use -1,-2,-3 instead of 0,1,2
  - Lots of people used [::-1]
  - Serial vs parallel
- Exercise 2:
  - Translation frame (some got it!)
  - Human positions start at 1.

Control Flow: if statement

    # The input DNA sequence
    seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'

    # Remove the initial Met codon if it is there
    if seq.startswith('atg'):
        print "Sequence without initial Met:",seq[3:]
    else:
        print "Sequence (no initial Met):",seq

- Execution path depends on the string in seq.
- Make sure you change seq to different values.
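One way to try different values of seq without editing the program each time is to wrap the test in a function and call it several times. This is only a sketch: the function name `without_init_met` and the short test sequences are made up for illustration, and `print` is written with parentheses around a single value so the code runs under both Python 2 and Python 3.

```python
# Hypothetical helper (not from the slides): same if/else as above,
# but returning the result instead of printing it.
def without_init_met(seq):
    if seq.startswith('atg'):
        # Drop the first three characters (the initial Met codon)
        return seq[3:]
    else:
        # No initial Met codon: leave the sequence unchanged
        return seq

# Made-up test sequences, each taking a different execution path
print(without_init_met('atggcatga'))   # starts with 'atg'
print(without_init_met('gcatgacgt'))   # does not start with 'atg'
print(without_init_met('ATGGCATGA'))   # upper-case: 'ATG' is not 'atg'
```

Note that the third call returns its input unchanged, because string comparison is case-sensitive.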
Control Flow: if statement

    # The input DNA sequence
    seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'

    # Remove the initial Met codon if it is there
    if seq.startswith('atg'):
        initMet = True
        newseq = seq[3:]
    else:
        initMet = False
        newseq = seq

    # Output the results
    print "Original sequence:",seq
    print "Sequence starts with Met:",initMet
    print "Sequence without initial Met:",newseq

Control Flow: if statement

    # The input DNA sequence
    seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'

    # Remove the initial Met codon if it is there
    initMet = seq.startswith('atg')
    if initMet:
        newseq = seq[3:]
    else:
        newseq = seq

    # Output the results
    print "Original sequence:",seq
    print "Sequence starts with Met:",initMet
    print "Sequence without initial Met:",newseq

Control Flow: if statement

    # The input DNA sequence
    seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'

    # Remove the initial Met codon if it is there
    initMet = seq.startswith('atg')
    if initMet:
        seq = seq[3:]

    # Output the results
    print "Sequence starts with Met:",initMet
    print "Sequence without initial Met:",seq

Serial if statement

    # Determine the complementary nucleotide
    def complement(nuc):
        if nuc == 'A':
            comp = 'T'
        if nuc == 'T':
            comp = 'A'
        if nuc == 'C':
            comp = 'G'
        if nuc == 'G':
            comp = 'C'
        # comp is never assigned if nuc is not A, T, C, or G
        return comp

    # Use the complement function
    print "The complement of A is",complement('A')
    print "The complement of T is",complement('T')
    print "The complement of C is",complement('C')
    print "The complement of G is",complement('G')

Compound if statement

    # Determine the complementary nucleotide
    def complement(nuc):
        if nuc == 'A':
            comp = 'T'
        elif nuc == 'T':
            comp = 'A'
        elif nuc == 'C':
            comp = 'G'
        elif nuc == 'G':
            comp = 'C'
        else:
            comp = nuc
        return comp

    # Use the complement function
    print "The complement of A is",complement('A')
    print "The complement of T is",complement('T')
    print "The complement of C is",complement('C')
    print "The complement of G is",complement('G')

If statement conditions

- Any expression (variable, arithmetic, function call, etc.) that evaluates to True or False
- Any expression tested against another expression using:
  - == (equality), != (inequality)
  - < (less than), <= (less than or equal)
  - > (greater than), >= (greater than or equal)
  - in (an element of)
- Conditions can be combined using:
  - and, or, not, and parentheses

For (each) statements

- Sequential/Iterative execution

    # Print the numbers 0 through 4
    for i in range(0,5):
        print i

    # Print the nucleotides in seq
    seq = 'ATGGCAT'
    for nuc in seq:
        print nuc

- Note use of indentation to define a block!

For (each) statements

    # Input to program
    seq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'

    # Examine each symbol in seq and count the A's
    count = 0
    for nuc in seq:
        if nuc == 'A':
            count = count + 1

    # Output the result
    print "Sequence",seq,"contains",count,"A symbols"

For (each) statements

    # Examine each symbol in seq and count the A's
    def countAs(seq):
        count = 0
        for nuc in seq:
            if nuc == 'A':
                count = count + 1
        return count

    # Input to program
    inseq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'

    # Compute count
    aCount = countAs(inseq)

    # Output the result
    print "Sequence",inseq,"contains",aCount,"A symbols"

For (each) statements

    # Examine each symbol in seq and count those that match sym
    def countSym(seq,sym):
        count = 0
        for nuc in seq:
            if nuc == sym:
                count = count + 1
        return count

    # Input to program
    inseq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'

    # Compute count
    aCount = countSym(inseq,'A')

    # Output the result
    print "Sequence",inseq,"contains",aCount,"A symbols"

Exercise 1

- Write a Python program to compute the reverse complement of a codon
- Modularize!
  Place the reverse complement code in a new function.
- Use my solution to Homework #1 Exercise #1 as a starting point
- Add the "complement" function of this lecture (slide 12, "Compound if statement") as provided.
- Call the new function with a variety of codons
- Change the complement function to handle upper and lower-case nucleotide symbols.
- Test your code with upper and lower-case codons.

Exercise 2

- Write a Python program to determine whether or not a DNA sequence consists of an (integer) number of (perfect) "tandem" repeats.
- Test it on sequences:

    AAAAAAAAAAAAAAAA
    CACACACACACACAC
    ATTCGATTCGATTCG
    GTAGTAGTAGTAGTA
    TCAGTCACTCACTCAG

- Hint: Is the sequence the same as many repetitions of its first character?
- Hint: Is the first half of the sequence the same as the second half of the sequence?
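The two hints for Exercise 2 can each be written as a small test. The sketch below covers the hints only and is not a full solution to the exercise; the function names `is_single_char_repeat` and `halves_match` are hypothetical, and the decision to treat an empty or odd-length sequence as "no match" is an assumption made here.

```python
# Hint 1 (hypothetical name): is seq the same as many repetitions
# of its first character?
def is_single_char_repeat(seq):
    # Assumption: an empty sequence does not count as a repeat
    if seq == '':
        return False
    return seq == seq[0] * len(seq)

# Hint 2 (hypothetical name): is the first half of seq the same
# as the second half?
def halves_match(seq):
    # Assumption: an odd-length (or empty) sequence has no two equal halves
    if seq == '' or len(seq) % 2 != 0:
        return False
    half = len(seq) // 2
    return seq[:half] == seq[half:]

# Try both tests on two of the exercise sequences
print(is_single_char_repeat('AAAAAAAAAAAAAAAA'))
print(halves_match('TCAGTCACTCACTCAG'))
```

These two tests only detect a repeat unit of length one or of exactly half the sequence; handling the other repeat lengths is the remaining work of the exercise.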