Finding an Open Reading Frame (ORF) in a DNA Sequence
Teacher Notes

Overview and concepts

Overview
Students create Java classes to parse a DNA sequence to find an Open Reading Frame.

Grade level
AP Computer Science

Java concepts covered
ArrayLists, classes, static methods, object methods, passing parameters, returning values from methods.

Prior knowledge required
Open Reading Frame (ORF)
In-frame sequence
DNA
Nucleotides
Codons
Java String methods

Activity notes

Time frame
Reading: 30 minutes. Laboratory activity: 90 minutes.

Materials
Java
Java IDE
DNA sequence to be used
Rules for Open Reading Frames

Teaching Tips / Activity Overview
1. Depending on your students’ backgrounds, assign reading so that they understand how an Open Reading Frame in a DNA sequence is transcribed to messenger RNA and then translated to a protein. Explain start and stop codons and the rules that determine which section of a DNA sequence constitutes an Open Reading Frame.
2. Have students manually identify an ORF in a DNA sequence:
5' GGGATCGATGCCCCTTAAAGAGTTTACATATTGCTGGAGGCGTTAACCCCGGA 3'
3.
Have students code a program to identify an Open Reading Frame in a sequence of DNA:

String DNAsequence =
    "TACGCAATGCGTATCATTCTGCTGGGCGCTCCGGGCGCAGGTAAAGGTACTCAGGCTCAATTCATCATGGAGAAATACGGCATTCCGCAAATCTC" +
    "TACTGGTGACATGTTGCGCGCCGCTGTAAAAGCAGGTTCTGAGTTAGGTCTGAAAGCAAAAGAAATTATGGATGCGGGCAAGTTGGTGACTGAT" +
    "GAGTTAGTTATCGCATTACTCAAAGAACGTATCACACAGGAAGATTGCCGCGATGGTTTTCTGTTAGACGGGTTCCCGCGTACCATTCCTCAGGCA" +
    "GATGCCATGAAAGAAGCCGGTATCAAAGTTGATTATGTGCTGGAGTTTGATGTTCCAGACGAGCTGATTGTTGAGCGCATTGTCGGCCGTCGGGT" +
    "ACATGCTGCTTCAGGCCGTGTTTATCACGTTAAATTCAACCCACCTAAAGTTGAAGATAAAGATGATGTTACCGGTGAAGAGCTGACTATTCGTAA" +
    "AGATGATCAGGAAGCGACTGTCCGTAAGCGTCTTATCGAATATCATCAACAAACTGCACCATTGGTTTCTTACTATCATAAAGAAGCGGATGCAGG" +
    "TAATACGCAATATTTTAAACTGGACGGAACCCGTAATGTAGCAGAAGTCAGTGCTGAACTGGCGACTATTCTCGGTTAATTCTGGATGGCCTTATA" +
    "GCTAAGGCGGTTTAAGGCCGCCTTAGCTATTTCAAGTAAGAAGGGCGTAGTACCTACAAAAGGAGATTTGGCATGATGCAAAGCAAACCCGGCG" +
    "TATTAATGGTTAATTTGGGGACACCAGATGCTCCAACGTCGAAAGCTATCAAGCGTTATTTAGCTGAGTTTTTGAGTGACCGCCGGGTAGTTGATA" +
    "CTTCCCCATTGCTATGGTGGCCATTGCTGCATGGTGTTATTTTACCGCTTCGGTCACCACGTGTAGCAAAACTTTATCAATCCGTTTGGATGGAAGA" +
    "GGGCTCTCCTTTATTGGTTTATAGCCGCCGCCAGCAGAAAGCACTGGCAGCAAGAATGCCTGATATTCCTGTAGAATTAGGCATGAGCTATGGTTC" +
    "AC";

4. Require students to use ArrayLists and objects to accomplish this task. Have them develop the program in steps:
a) Parse the string to find start and stop codons.
b) Create an object to record the position of each codon found and whether it is a start or a stop codon.
c) Create an ArrayList of these objects. Each element will represent a start or stop codon.
d) Identify sequence(s) that start with a start codon, end with a stop codon, contain at least 300 nucleotides, and are in frame.
e) Store all sequences that meet these criteria in an ArrayList whose elements are objects made up of each sequence, the position of its start codon, and the position of its stop codon.
f) Create the appropriate toString methods to print the original sequence and each ORF located in it.
Report sequences in line widths of 80.
5. Here is an example of the expected output. In this output, codon positions are zero-based string indices, while the ORF start and end positions are one-based.

/*
Original Sequence
TACGCAATGCGTATCATTCTGCTGGGCGCTCCGGGCGCAGGTAAAGGTACTCAGGCTCAATTCATCATGGAGAAATACGG
CATTCCGCAAATCTCTACTGGTGACATGTTGCGCGCCGCTGTAAAAGCAGGTTCTGAGTTAGGTCTGAAAGCAAAAGAAA
TTATGGATGCGGGCAAGTTGGTGACTGATGAGTTAGTTATCGCATTACTCAAAGAACGTATCACACAGGAAGATTGCCGC
GATGGTTTTCTGTTAGACGGGTTCCCGCGTACCATTCCTCAGGCAGATGCCATGAAAGAAGCCGGTATCAAAGTTGATTA
TGTGCTGGAGTTTGATGTTCCAGACGAGCTGATTGTTGAGCGCATTGTCGGCCGTCGGGTACATGCTGCTTCAGGCCGTG
TTTATCACGTTAAATTCAACCCACCTAAAGTTGAAGATAAAGATGATGTTACCGGTGAAGAGCTGACTATTCGTAAAGAT
GATCAGGAAGCGACTGTCCGTAAGCGTCTTATCGAATATCATCAACAAACTGCACCATTGGTTTCTTACTATCATAAAGA
AGCGGATGCAGGTAATACGCAATATTTTAAACTGGACGGAACCCGTAATGTAGCAGAAGTCAGTGCTGAACTGGCGACTA
TTCTCGGTTAATTCTGGATGGCCTTATAGCTAAGGCGGTTTAAGGCCGCCTTAGCTATTTCAAGTAAGAAGGGCGTAGTA
CCTACAAAAGGAGATTTGGCATGATGCAAAGCAAACCCGGCGTATTAATGGTTAATTTGGGGACACCAGATGCTCCAACG
TCGAAAGCTATCAAGCGTTATTTAGCTGAGTTTTTGAGTGACCGCCGGGTAGTTGATACTTCCCCATTGCTATGGTGGCC
ATTGCTGCATGGTGTTATTTTACCGCTTCGGTCACCACGTGTAGCAAAACTTTATCAATCCGTTTGGATGGAAGAGGGCT
CTCCTTTATTGGTTTATAGCCGCCGCCAGCAGAAAGCACTGGCAGCAAGAATGCCTGATATTCCTGTAGAATTAGGCATG
AGCTATGGTTCAC

Start codon found at position 6
Start codon found at position 66
Start codon found at position 105
Start codon found at position 162
Start codon found at position 291
Stop codon found at position 648
Start codon found at position 657
Stop codon found at position 666
Stop codon found at position 741
Stop codon found at position 765
Start codon found at position 789
Stop codon found at position 822
Stop codon found at position 834
Stop codon found at position 849
Start codon found at position 888
Stop codon found at position 921
Stop codon found at position 1026
Stop codon found at position 1032
Stop codon found at position 1038
Start codon found at position 1044

ORF start position: 292
ORF end position: 651
ORF Sequence(s)
ATGAAAGAAGCCGGTATCAAAGTTGATTATGTGCTGGAGTTTGATGTTCCAGACGAGCTGATTGTTGAGCGCATTGTCGG
CCGTCGGGTACATGCTGCTTCAGGCCGTGTTTATCACGTTAAATTCAACCCACCTAAAGTTGAAGATAAAGATGATGTTA
CCGGTGAAGAGCTGACTATTCGTAAAGATGATCAGGAAGCGACTGTCCGTAAGCGTCTTATCGAATATCATCAACAAACT
GCACCATTGGTTTCTTACTATCATAAAGAAGCGGATGCAGGTAATACGCAATATTTTAAACTGGACGGAACCCGTAATGT
AGCAGAAGTCAGTGCTGAACTGGCGACTATTCTCGGTTAA
*/

Assessment
Programs will be graded on the correctness of the output, the correct use of ArrayLists and objects to solve the problem, and the algorithm used to determine ORFs.

Extensions
Additional sequences can be obtained online and analyzed with the program.
ORFs can be transcribed into messenger RNA.
Messenger RNA can be translated into proteins.

Acknowledgments
This lesson and teacher notes were produced by Cynthia Lang, adapted from
1. University of Wisconsin BioWeb: http://bioweb.uwlax.edu/GenWeb/Molecular/Theory/Translation/Translation_Problems/translation_problems.htm
2. Bioinformatics Exercises, developed by Paul Craig, Department of Chemistry, Rochester Institute of Technology: http://media.wiley.com/product_data/excerpt/57/04712149/0471214957-3.pdf
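
Reference sketch for instructors
As a possible reference point when grading, steps a) through e) of the activity might be sketched as follows. This is one illustrative approach under stated assumptions, not the lesson's official solution. The names OrfFinder, CodonHit, Orf, findCodons, findOrfs, and wrap are hypothetical, as are the frame and minLength parameters. The sketch scans a single reading frame in steps of three, which is consistent with the sample output, where every reported position is divisible by three. It also assumes that each stop codon is paired with the most recent start codon before it, because that choice matches the single ORF in the sample output (start codon at 291, stop codon at 648); pairing with the first start codon instead would report a longer sequence. Codon positions are zero-based and ORF positions are one-based, as in the sample output.

```java
import java.util.ArrayList;
import java.util.List;

class OrfFinder {
    static final int CODON_LENGTH = 3;

    /** Hypothetical record of one start or stop codon found in the scanned frame. */
    static class CodonHit {
        final int position;    // zero-based index of the codon's first nucleotide
        final boolean isStart; // true for ATG; false for TAA, TAG, or TGA

        CodonHit(int position, boolean isStart) {
            this.position = position;
            this.isStart = isStart;
        }

        @Override
        public String toString() {
            return (isStart ? "Start" : "Stop") + " codon found at position " + position;
        }
    }

    /** Hypothetical record of one ORF with one-based, inclusive start and end positions. */
    static class Orf {
        final String sequence;
        final int start;
        final int end;

        Orf(String sequence, int start, int end) {
            this.sequence = sequence;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "ORF start position: " + start + "\nORF end position: " + end
                    + "\nORF Sequence(s)\n" + wrap(sequence, 80);
        }
    }

    /** Steps a) to c): collect every start and stop codon in one reading frame. */
    static List<CodonHit> findCodons(String dna, int frame) {
        List<CodonHit> hits = new ArrayList<>();
        for (int i = frame; i + CODON_LENGTH <= dna.length(); i += CODON_LENGTH) {
            String codon = dna.substring(i, i + CODON_LENGTH);
            if (codon.equals("ATG")) {
                hits.add(new CodonHit(i, true));
            } else if (codon.equals("TAA") || codon.equals("TAG") || codon.equals("TGA")) {
                hits.add(new CodonHit(i, false));
            }
        }
        return hits;
    }

    /** Steps d) and e): pair each stop with the most recent start (an assumption). */
    static List<Orf> findOrfs(String dna, int frame, int minLength) {
        List<Orf> orfs = new ArrayList<>();
        int lastStart = -1; // -1 means no start codon seen since the previous stop
        for (CodonHit hit : findCodons(dna, frame)) {
            if (hit.isStart) {
                lastStart = hit.position;
                continue;
            }
            int endExclusive = hit.position + CODON_LENGTH;
            if (lastStart >= 0 && endExclusive - lastStart >= minLength) {
                // Zero-based exclusive end equals one-based inclusive end.
                orfs.add(new Orf(dna.substring(lastStart, endExclusive), lastStart + 1, endExclusive));
            }
            lastStart = -1; // a stop codon closes the current candidate
        }
        return orfs;
    }

    /** Breaks text into lines of at most width characters, each ending in a newline. */
    static String wrap(String text, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i += width) {
            sb.append(text, i, Math.min(i + width, text.length())).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Short made-up sequence for demonstration; it is not the lesson's sequence.
        String dna = "CCCATGAAAATGCCCGGGTAAATGTAGCCC";
        for (CodonHit hit : findCodons(dna, 0)) {
            System.out.println(hit);
        }
        for (Orf orf : findOrfs(dna, 0, 9)) {
            System.out.print(orf);
        }
    }
}
```

To apply the sketch to the lesson's sequence, a call such as findOrfs(DNAsequence, 0, 300) would use the 300-nucleotide minimum from step d). Students who scan all three frames, or who pair stop codons with the first start codon, may legitimately report different ORFs, so the pairing rule is worth discussing in class.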