New approaches for determining functional siRNA
Liyang Diao
Dr. Stanley Dunn, advisor

Protein Production
• Production of proteins starts with DNA
• DNA is in the nucleus
• Requires mRNA to finish protein production
mRNA: messenger RNA
RNAi: RNA interference
• Suppresses gene expression
• Affects mRNA
Diagram: DNA → mRNA → protein (http://nobelprize.org/educational_games/medicine/dna/index.html)

More on RNAi
siRNA: short-interfering RNA
• Typically 20-25 nucleotides long
• Double-stranded
• Participates in RNAi by directing degradation of the target mRNA
Potential for effective gene therapy
Issues
• Some genes are more effectively suppressed than others
• Mechanism is poorly understood
Diagram: http://www.ambion.com/techlib/append/RNAi_mechanism.html

Question
How do we know which siRNA are functional?
Some ideal properties:
• GC content between 30% and 55%
• Low level of secondary structure
• Differential between thermodynamic stability of 5’ and 3’ ends: A/U content
• Specific positional nucleotide preferences
• Avoid long GC stretches
Image: http://bioinf.man.ac.uk/resources/phase/manual/RNAMolecule.png

Previous Model
Pancoska’s Eulerian graph model
• First represent an siRNA sequence as a directed graph
• Then construct a weighted undirected Eulerian graph
• Graph vertices: A, T, G, C
• Compare graphs for functional and non-functional siRNA
• For these two sets of siRNA, compute graph properties that reflect sequence structure

Issues with Pancoska’s Algorithm
Graph vertices: A, T, G, C
Example sequences: ATTCGTGGACG, GATTCGTGGAC, CGATTCGTGGA, …
• Uniqueness
• Complex pattern recognition

Other Ideas
• Number of nucleotide mutations
• Levenshtein distance: the minimum number of single-character substitutions, insertions, or deletions required to go from one string to another

Current/Future Progress
• There are 4^20 possible siRNA strands of length 20
• How many are potentially functional?
• Combinatorics!

Math
• Let H(n,i,j) be the number of possible placements of the A/U and G/C pairs.
• Thus, the total number of potential strings is 2^20 * H(n,i,j).
• n: the total number of G or C nucleotides
• i: the total number of A or U nucleotides at the 5’ end
• j: the total number of A or U nucleotides at the 3’ end
Quantity desired:
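
Several of the "ideal properties" from the Question slide are purely sequence-level and can be checked directly. The sketch below is only an illustration, not the method used in this work. The 30-55% GC range comes from the slide. The function names, the end-window size of 5, and the cutoff of 6 for a "long" GC stretch are all hypothetical choices, since the slides do not specify them. T is counted together with U because the slides use both alphabets.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C nucleotides in the sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def longest_gc_run(seq: str) -> int:
    """Length of the longest uninterrupted stretch of G/C nucleotides."""
    longest = run = 0
    for base in seq.upper():
        run = run + 1 if base in "GC" else 0
        longest = max(longest, run)
    return longest


def end_au_counts(seq: str, window: int = 5) -> tuple[int, int]:
    """A/U counts in the first and last `window` nucleotides (5' end, 3' end).

    The window size is an assumption; the slides do not define how many
    positions count as "the end". T is treated like U.
    """
    seq = seq.upper()
    five_prime = sum(base in "AUT" for base in seq[:window])
    three_prime = sum(base in "AUT" for base in seq[-window:])
    return five_prime, three_prime


def passes_basic_filters(seq: str, min_gc: float = 0.30,
                         max_gc: float = 0.55, max_gc_run: int = 6) -> bool:
    """GC content inside the slide's 30-55% range and no long GC stretch.

    max_gc_run is a hypothetical cutoff chosen for illustration only.
    """
    return (min_gc <= gc_content(seq) <= max_gc
            and longest_gc_run(seq) <= max_gc_run)


candidate = "AUGCAUGCAUGCAUGCAUGC"
print(gc_content(candidate), longest_gc_run(candidate),
      end_au_counts(candidate), passes_basic_filters(candidate))
```

With a fixed window, the pair returned by `end_au_counts` plays the role of i and j in the Math slide, and `gc_content(seq) * len(seq)` gives n. Secondary structure and positional preferences are not covered here because they need more than simple counting.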
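
The Levenshtein distance mentioned under Other Ideas can be computed with the standard dynamic-programming recurrence. The following is a minimal sketch of that textbook algorithm, not code from this project. The function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("ATTCGTGGACG", "GATTCGTGGAC"))
```

For the first two example sequences on the "Issues with Pancoska’s Algorithm" slide, the distance is 2: one insertion at the front and one deletion at the end. Counting only substitutions at matching positions (a plain mutation count) would report a much larger difference for the same pair.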