Lab 3
2/14/2012
Happy Valentine!
Fah Sathirapongsasuti and Emily Kay

1. Notes from HW2
   a. The if __name__ == "__main__": block must not be left empty
   b. Function names are case sensitive: Pearson vs pearson
   c. print vs return
   d. Boolean True/False vs string "True"/"False"
   e. Call your functions, e.g. call mean in stdev, call coinflip in coinflipper, call dieroll in dieroller
   f. float(n)/N, not float(n/N)
   g. Use +=
   h. In dieroller, don't call rolldie twice (each call is a new roll), as in:
      if rolldie() == 3 or rolldie() == 6: count += 1
2. Questions about HW3
3. Programming style
   a. if fMatch == True: vs if fMatch:
   b. All imports go at the top
4. File I/O: command line caller AND Python receiver
   a. python script.py < input.txt AND sys.stdin
   b. python script.py input.txt AND open(sys.argv[1])
   c. python script.py < input.txt > output.txt AND sys.stdin/sys.stdout
   d. python script.py input.txt output.txt AND open(sys.argv[1])/open(sys.argv[2], "w")
   e. Exercise: write a script csv2txt.py that takes a comma-separated value (csv) file and converts it to a tab-delimited file (txt). Choose an I/O style between:
      i. python csv2txt.py < input.csv > output.txt
      ii. python csv2txt.py input.csv output.txt
5. Regular Expressions
   a. Mtch = re.search(pattern, string); peek at Mtch and see what it is
   b. Mtch.group(i)
   c. Mtch.groups()
   d. Exercise: write a function findOrf that takes a DNA sequence string and identifies an open reading frame.
   e. Pattern review: read a BED line (http://genome.ucsc.edu/FAQ/FAQformat#format1)
      i. mtch = re.search(r'^chr(\d+|[XY])', chrom)
      ii. mtch = re.search(r'^chr(\d+|[XY])\s(\d+)\s(\d+)', bedLine)
      iii. mtch = re.search(r'^chr(\d+|[XY])\s(\d+)\s(\d+)(\s.+)*$', longBedLine)
   f. Exercise: write a function that takes an input BED file, scans it, and prints out all illegal genomic intervals (i.e. BED lines whose end position is less than the start position). The function should have the prototype:
      checkBed(fileBed) ==> print illegal bed lines
   g. Exercise: Emily's Mc1r.fasta example (see Mc1r_fasta_example.txt)
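
A sketch for HW2 note 1h. Writing rolldie() twice in one condition rolls the die twice, so the second test looks at a different roll from the first. The chance of counting a roll becomes 1/6 + (5/6)(1/6) = 11/36 instead of the intended 1/3. The bodies of rolldie and dieroller below are assumptions made for illustration, since the HW2 versions are not shown here.

```python
import random


def rolldie():
    """Return a random integer from 1 to 6 (assumed behaviour of the HW2 function)."""
    return random.randint(1, 6)


def dieroller(n_rolls):
    """Return the fraction of n_rolls rolls that come up 3 or 6 (hypothetical task)."""
    count = 0
    for _ in range(n_rolls):
        roll = rolldie()              # roll once and store the result
        if roll == 3 or roll == 6:    # both tests look at the same roll
            count += 1                # note 1g: use +=
    return float(count) / n_rolls     # notes 1c and 1f: return, and convert before dividing


if __name__ == "__main__":
    print(dieroller(6000))            # should be close to 1/3
```

Storing the roll in a variable also lets the condition be written as `if roll in (3, 6):`.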
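
The caller/receiver pairs in section 4 can be sketched as one helper that chooses its streams from the command line. This is a minimal illustration, not a solution to csv2txt.py: the names open_streams and number_lines are made up for this example, and the line-numbering task is only a placeholder.

```python
import io
import sys


def open_streams(argv):
    """Pick input/output streams from a command line (argv[0] is the script name).

    No file arguments  -> sys.stdin / sys.stdout (use with < and > redirection)
    One file argument  -> open(argv[1]) / sys.stdout
    Two file arguments -> open(argv[1]) / open(argv[2], "w")
    """
    infile = open(argv[1]) if len(argv) > 1 else sys.stdin
    outfile = open(argv[2], "w") if len(argv) > 2 else sys.stdout
    return infile, outfile


def number_lines(infile, outfile):
    """Placeholder task: copy each input line to the output, prefixed by its line number."""
    for i, line in enumerate(infile, start=1):
        outfile.write("%d\t%s" % (i, line))


if __name__ == "__main__":
    # Demo with in-memory streams so the sketch runs without any files.
    # In a real script these lines would instead be:
    #     infile, outfile = open_streams(sys.argv)
    #     number_lines(infile, outfile)
    demo_out = io.StringIO()
    number_lines(io.StringIO("first\nsecond\n"), demo_out)
    print(demo_out.getvalue(), end="")
```

Because the processing function only sees file-like objects, the same code works whether the caller uses redirection (styles a and c) or passes file names (styles b and d).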
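
To make the match-object calls in section 5 concrete, here is a small sketch that applies the three-field BED pattern to one line. The helper name parse_bed_line and the example line are assumptions for illustration. The numeric groups are written as (\d+): a group written as (\d)+ still matches the whole number, but it captures only the last digit.

```python
import re

# Chromosome, start, end. The + sits inside each numeric group so the
# group captures the whole number.
BED_PATTERN = re.compile(r"^chr(\d+|[XY])\s(\d+)\s(\d+)")


def parse_bed_line(bed_line):
    """Return (chrom, start, end) for a BED line, or None if it does not match."""
    mtch = BED_PATTERN.search(bed_line)
    if mtch is None:                    # re.search returns None when nothing matches
        return None
    chrom, start, end = mtch.groups()   # tuple of all captured groups (strings)
    return chrom, int(start), int(end)


if __name__ == "__main__":
    line = "chr7\t127471196\t127472363\tPos1"   # illustrative BED line
    mtch = BED_PATTERN.search(line)
    print(mtch)             # a match object; peek at it
    print(mtch.group(0))    # the whole matched text
    print(mtch.group(1))    # first group: chromosome name without "chr"
    print(mtch.groups())    # all groups at once
    print(parse_bed_line(line))
    print(parse_bed_line("track name=example"))   # no match, so None
```

Converting start and end to int matters for a check like the one in exercise 5f, because comparing the captured strings would compare them alphabetically rather than numerically.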