Lab_3_Note - The Huttenhower Lab

Lab 3
2/14/2012 Happy Valentine!
Fah Sathirapongsasuti and Emily Kay
1. Note from HW2
a. The if __name__ == "__main__": block must not be left empty
b. Function names are case sensitive: Pearson vs pearson
c. print vs return
d. Boolean True/False vs string "True"/"False"
e. Call your functions, e.g. call mean in stdev, call coinflip in coinflipper, call
dieroll in dieroller
f. float(n)/N not float(n/N): in Python 2, n/N on two integers truncates before float() is applied
g. Use +=
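h. In dieroller don’t call rolldie twice; each call is a new roll, so this tests two different rolls:
if rolldie() == 3 or rolldie() == 6:
    count += 1
Store the roll in a variable first, then test that variable.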
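A minimal sketch pulling together notes (c), (f), (g) and (h). The signature of dieroller (it takes the number of rolls and returns the fraction that land on 3 or 6) and the use of random.randint are assumptions for illustration, not the HW2 specification.

```python
import random


def rolldie():
    """Return one roll of a fair six-sided die (1 to 6)."""
    return random.randint(1, 6)


def dieroller(n_rolls):
    """Return the fraction of n_rolls rolls that come up 3 or 6.

    Hypothetical signature, chosen only to illustrate the notes.
    """
    count = 0
    for _ in range(n_rolls):
        roll = rolldie()              # roll once and keep the result
        if roll == 3 or roll == 6:    # both tests look at the same roll
            count += 1                # note (g): use +=
    # note (f): convert to float before dividing, not after
    # note (c): return the value instead of printing it
    return float(count) / n_rolls
```

Returning the fraction lets the caller decide whether to print it, store it, or pass it to another function.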
2. Questions about HW3
3. Programming style
a. if fMatch == True: vs if fMatch:
b. All imports go at the top of the file
4. File I/O: command-line caller AND Python receiver
a. python script.py < input.txt AND sys.stdin
b. python script.py input.txt AND open(sys.argv[1])
c. python script.py < input.txt > output.txt AND sys.stdin/sys.stdout
d. python script.py input.txt output.txt AND
open(sys.argv[1])/open(sys.argv[2], "w")
e. Exercise: write a script csv2txt.py that takes in a comma-separated value (CSV) file
and converts it to a tab-delimited file (txt). Choose one of these two I/O styles:
i. python csv2txt.py < input.csv > output.txt
ii. python csv2txt.py input.csv output.txt
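One possible sketch of the csv2txt exercise in style (i). The function name csv2txt and its two stream arguments are assumptions; the standard csv module is used so that quoted fields containing commas are split correctly, which a plain line.split(",") would not handle.

```python
import csv
import sys


def csv2txt(instream, outstream):
    """Copy CSV rows from instream to outstream as tab-delimited lines."""
    for row in csv.reader(instream):
        outstream.write("\t".join(row) + "\n")


# At the bottom of csv2txt.py, style (i) would be wired up as:
#     if __name__ == "__main__":
#         csv2txt(sys.stdin, sys.stdout)
#
# Style (ii) would instead open the two file names given on the command line:
#     if __name__ == "__main__":
#         csv2txt(open(sys.argv[1]), open(sys.argv[2], "w"))
```

Because the function only sees two streams, the same code serves both calling styles; only the last two lines of the script change.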
5. Regular Expressions
a. Mtch = re.search(pattern, string)
peek at Mtch and see what it is
b. Mtch.group(i)
c. Mtch.groups()
d. Exercise: write a function findOrf that takes a string of DNA sequence and identifies an
open reading frame.
e. Pattern review: Read BED line (http://genome.ucsc.edu/FAQ/FAQformat#format1)
i. mtch = re.search(r'^chr(\d+|[XY])', chrom)
ii. mtch = re.search(r'^chr(\d+|[XY])\s(\d+)\s(\d+)', bedLine)
iii. mtch = re.search(r'^chr(\d+|[XY])\s(\d+)\s(\d+)(\s.+)*$',
longBedLine)
f. Exercise: write a function that takes in an input BED file, scans it, and prints out all illegal
genomic intervals (i.e. BED lines whose end position is less than the start position). The
function should have the prototype:
checkBed(fileBed) ==> print illegal bed lines
g. Exercise: Emily’s Mc1r.fasta example (see Mc1r_fasta_example.txt)
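A hedged sketch of how re.search, groups() and the BED pattern could fit together for the checkBed exercise. It assumes fileBed is a file name rather than an open file, and parseBedLine is a hypothetical helper that is not part of the notes. Each coordinate is captured whole with (\d+), so the group holds the full number rather than a single digit.

```python
import re

# Chromosome, start and end, each captured as a complete group.
BED_PATTERN = re.compile(r'^chr(\d+|[XY])\s(\d+)\s(\d+)')


def parseBedLine(bedLine):
    """Return (chrom, start, end), or None if the line does not match."""
    mtch = BED_PATTERN.search(bedLine)
    if not mtch:                         # re.search returns None on failure
        return None
    chrom, start, end = mtch.groups()    # same as group(1), group(2), group(3)
    return chrom, int(start), int(end)   # groups are strings, so convert


def checkBed(fileBed):
    """Print every BED line whose end position is less than its start."""
    with open(fileBed) as handle:
        for line in handle:
            parsed = parseBedLine(line)
            if parsed is None:           # skip lines the pattern cannot read
                continue
            chrom, start, end = parsed
            if end < start:
                print(line.rstrip("\n"))
```

Converting the captured strings with int() matters here: compared as strings, "500" would sort after "1000".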