SUMMER RESEARCH: THE SUPERSTRING PROBLEM Charles Mullins DIMACS Biomaths Conference April 30, 2005 THE SUPERSTRING PROBLEM Human genome consists of billions of bases: A,C,G,T Current technology can only sequence “short” strings from 500-1000 bases Genome is cut into smaller strings that are sequenced How to recover the original superstring A SUPERSTRING CONTAINS ALL THE ORIGINAL STRINGS Occam’s razor Nature is efficient LOOK FOR SHORTEST SUPERSTRING SS! Greedy Algorithm: proceed pairwise to get maximal overlap at each “stage” Greedy doesn’t always give SS HOW GOOD IS “GREEDY?” Early results proved resulting SS was never worse than 3 times as long This factor was slowly reduced by others Our mentor Elizabeth “Z” Sweedyk obtained a factor of 2.5 EXAMPLE OF GREEDY XABAB ABABY BABA FIRST, SECOND: ABAB FIRST, THIRD: BAB SECOND,THIRD: ABA REPLACE FIRST PAIR WITH XABABY XABABY,BABA YIELD XABABYBABA SS IS XABABABY Our research considered strings consisting of m zeros followed by n ones followed by p zeros: 01100 000111100 etc Key result: Greedy gives SS CONJECTURE In general, “Greedy” will never produce a result more than twice the length of a shortest superstring TEACHING RESEARCH METHODS AT ASMSA Charles Mullins Arkansas School for Mathematics, Science and the Arts Hot Springs AR 71910 Mullinsc@asmsa.org Topics Research Through Technology Junior FIRM Senior FIRM RTT Required course for all entering juniors Fall semester Objectives in: Technology Science Math Writing Technology objectives Learn to use: TI calculator GraphLink & TI-Interactive Office E-mail, Web, HTML Turnitin.com Math Objectives: Get introduced to : Regressions and data modeling Probability Descriptive statistics Inferential statistics Structure Introductory lessons & activities Four mini projects 1. 2. 3. 4. The Ideal Weight The Dubl Stuf Dilemma Pop Off M & Ms Science Objectives Learn: How to design & do experiments How to present & model data Writing objectives Learn: Our lab report format & style How to paraphrase & cite How to integrate data, graphs, equations, etc. Text http://165.29.91.7/math/Rizzle/Final.pdf PDF-formatted copy of the text we wrote for RTT Scheduling All our classes meet 3 times per week Monday all 7 classes for 55 mins Tuesday periods 1 - 4 for 75 mins Wed. periods 5 - 7 for 75 mins. Thur & Fri are repeats but for 90 mins. Scheduling Gives us Tues. & Wed. afternoon w/o classes Tuesday for Junior FIRM Wednesday for senior FIRM 2 hour blocks to work with our students on their projects Junior FIRM Prelude during November Faculty post database of problem statements and interest areas Students review database Choose faculty ideas they like Formulate their own that overlaps w/ faculty interest Project matching Students interview w/ chosen faculty to: Compete for a faculty-chosen problem Sell their idea to a mentor Goals: Match each junior w/ mentor by end of Jan. Distribute juniors, 5 per teacher Assignments Be ready to start experiment on 1 June Formulate problem statement & hypothesis (design goal) Collect sources & start bibliography Study background science Start thinking about required materials Plan experimental techniques Assignments Critique seniors project displays and oral presentations Present their planned experiment to a panel of faculty & seniors Summer work Ideally they should start their experiment if possible Minimum requirement is to be ready to start in August Senior FIRM More of the same Continue to study background Refine method Collect data, obtain results, & draw a conclusion Early Dec. deadline for preliminary results Cooperation All writing assignments submitted to mentor and in composition class Graded by differing criteria: Mentor looks for quality science Comp. teacher looks at writing Math teachers help w/ statistics End products Science paper Project display for science fair Oral presentation Junior Academy of Science Benefits Students leave school: with lab skills knowing how to write lab reports Knowing how to present results Students do well in state and international science fairs Science fair We have enough students to have our own ISEF-affiliated regional fair Must have 50 students $500 affiliation fee Must send at least one finalist and adult to International fair. ACKNOWLEDGEMENTS The presentation on implementing research at ASMSA was first given at the NCSSSMST Expedition 2005 conference in St. Louis, March 9-12, 2005, by my colleagues, Dr. Brian Monson, Dept of Science Chair, and Bruce Turkal, Dept of Mathematics