Genome-Scale PCR Infidelity Search (The Promiscuous Primer Problem)

Goal: An efficient search for potential undesired PCR products that scans through the ~3 billion bases of human DNA.

Conditions for Mispriming
► 1. The first 3 bases at the 3' end of the primer must perfectly complement 3 bases in the sequence.
► 2. The melting temperature (Tm) of the potential duplex must exceed x, where x = (annealing temperature of the PCR reaction − y).
► 3. The position of the mismatch(es) between the primer and genomic DNA affects the binding of the polymerase, so the search must account for differential binding based on mismatch position (Real-Time PCR studies will address this).
► 4. The forward and reverse primers must bind within z bases of each other.

Central Issues of Infidelity Search
1. Size of the human genome: ~3 billion base pairs. Run a BLAST search locally, or presort the genome as a prefix tree and write a search program.
2. Tm calculations are computationally expensive to repeat billions of times. Precalculate all sequences that will misprime rather than calculating Tms "on the fly."

The Algorithm

$$T_m = \frac{\Delta H}{\Delta S + R \ln C_T}$$

$$\Delta T_m = T_{m1} - \frac{\Delta H - \delta H_T + \delta H_{T'}}{\Delta S + K - \delta S_T + \delta S_{T'}}$$

We know the primer's Tm, ΔH, and ΔS. Scan all possible single mutations and calculate their Tms, going from most to least stable. Repeat the process for two mutations, and so on. Once the calculated Tms reach the threshold value, we know the primer will not bind to any of the remaining, less stable sequences.

The Search
After calculating all the possible sequences that will misprime with our primer, we can then do a string search.

The ideal answer:
1. Run a batch BLAST locally with all the possible sequences.
2. If a forward hit and a reverse hit are < z bases apart, we call that a misprime!

The hard answer:
1. Write a search program over the presorted genome that holds the misprime indices for the forward and reverse primers, then check whether any forward and reverse hit lie < z bases apart.

[Figure: prefix tree of the presorted genome. The node A branches to AA, AC, AG, and AT; the node AA branches to AAA, AAC, AAG, and AAT.]
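The precalculate-then-search idea can be sketched in a few lines of Python. This is only an illustration under several assumptions, not the implementation behind these slides:

- A maximum mismatch count stands in for the Tm threshold.
- A plain set lookup over a short string stands in for BLAST or the prefix tree.
- Only one orientation is checked: the forward primer on the top strand, with the reverse primer's binding site downstream.
- "Apart" is measured between the start positions of the two hits.
- The names `variants`, `find_sites`, and `misprime_pairs` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations, product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def variants(primer, max_mismatches, clamp=3):
    """Enumerate every sequence within max_mismatches substitutions of
    primer, leaving the last `clamp` bases (the 3' end) untouched.

    ASSUMPTION: the mismatch count replaces the Tm threshold here.
    """
    free_positions = range(len(primer) - clamp)
    found = {primer}
    for n in range(1, max_mismatches + 1):
        for positions in combinations(free_positions, n):
            # At each chosen position, try the three other bases.
            choices = [[b for b in "ACGT" if b != primer[p]] for p in positions]
            for substitution in product(*choices):
                seq = list(primer)
                for p, base in zip(positions, substitution):
                    seq[p] = base
                found.add("".join(seq))
    return found


def find_sites(genome, candidates, length):
    """Return ascending start indices where a candidate occurs in genome."""
    return [i for i in range(len(genome) - length + 1)
            if genome[i:i + length] in candidates]


def misprime_pairs(genome, fwd, rev, max_mismatches, z):
    """Return (forward_start, reverse_start) pairs less than z bases apart."""
    fwd_sites = find_sites(genome, variants(fwd, max_mismatches), len(fwd))
    # The reverse primer binds the top strand at its reverse complement.
    rev_targets = {reverse_complement(v) for v in variants(rev, max_mismatches)}
    rev_sites = find_sites(genome, rev_targets, len(rev))
    pairs = []
    for f in fwd_sites:
        # Only consider reverse sites that start downstream of f.
        for r in rev_sites[bisect_right(rev_sites, f):]:
            if r - f >= z:
                break  # rev_sites is sorted, so later sites are farther away
            pairs.append((f, r))
    return pairs


# Toy "genome": forward site at index 2, reverse target at index 12.
genome = "TT" + "ACGTAC" + "GGGG" + reverse_complement("TTGCAG") + "AA"
print(misprime_pairs(genome, "ACGTAC", "TTGCAG", max_mismatches=1, z=20))
```

On the toy genome this prints `[(2, 12)]`. The candidate set is built once per primer and each genome window is then tested with a single set lookup, which mirrors the slides' point about avoiding Tm calculations "on the fly." At genome scale, the linear scan in `find_sites` would need to be replaced by the presorted index or a local BLAST run.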