PYROSEQUENCING
Genome Sequencing Utilizing Light-Emitting Luciferase and PCR-Reaction-Mixture-in-Oil Emulsion

Mr. Meir Shachar
Dr. Edwin Ginés-Candelaria

Introduction*
• Read lengths are around 200-300 bases.
• 400,000 reads are sequenced in parallel.
• 100 Mb of output per run.
• Run time of 7.5 hours.
*Unless otherwise stated, read and output data are for the 454 FLX 20 sequencer.

Step 1: Preparation of the DNA
• DNA is fragmented by nebulization.
• The ends of the DNA strands are made blunt with appropriate enzymes.
• “A” and “B” adapters are ligated to the blunt ends using DNA ligase.
• The strands are denatured using sodium hydroxide to release the ssDNA template library (sstDNA).

The Adapters
• The A and B adapters are used as priming sites for both amplification and sequencing, since their sequences are known.
• The B adapter contains a 5’ biotin tag used for immobilization.
• Streptavidin-coated magnetic beads bind the biotin on the B adapters.

Filtering the Mess
• Four adapter combinations are formed by the ligation:
• A---sequence---A
• A---sequence---B
• B---sequence---A
• B---sequence---B

Step 2: Cloning of the DNA (emPCR)
• Using a water-in-oil emulsion, each ssDNA in the library is hybridized onto a primer-coated bead.
• Limiting dilution creates conditions in which each bead carries only one ssDNA molecule.
• Each bead is then captured in its own emulsion micro-reactor, which contains all the ingredients needed for a PCR reaction.
• PCR takes place on each of these beads individually, but all in parallel.
• The process as a whole is called emulsion PCR (emPCR).

Post emPCR
• The micro-reactors are broken, and the beads are released.
• Enrichment beads (containing biotin) are added; these attach to DNA-rich beads only.
• A magnetic field separates the DNA-rich beads from the empty beads, and the enrichment beads are then removed from the DNA-rich beads.
• The DNA on the beads is denatured again using sodium hydroxide, yielding ssDNA-rich beads ready for sequencing.
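The limiting-dilution idea in Step 2 can be illustrated with a small calculation. The sketch below assumes, for illustration only, that template molecules land on beads independently, so that the number of templates per bead follows a Poisson distribution with mean `lam` (templates per bead). The slides do not state this model, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k templates on a bead under a Poisson model."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bead_occupancy(lam):
    """Fractions of beads with zero, exactly one, or more than one template.

    lam is the assumed mean number of template molecules per bead.
    """
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}


if __name__ == "__main__":
    # Compare a few dilutions: lower lam means more empty beads,
    # but a larger share of the DNA-carrying beads are clonal.
    for lam in (0.1, 0.5, 1.0):
        occ = bead_occupancy(lam)
        clonal = occ["single"] / (occ["single"] + occ["multiple"])
        print(f"lam={lam}: empty={occ['empty']:.3f}, "
              f"single={occ['single']:.3f}, multiple={occ['multiple']:.3f}, "
              f"clonal share of DNA beads={clonal:.3f}")
```

Under this assumed model, stronger dilution trades yield for purity: more beads stay empty, which is consistent with the need for the enrichment step described in Post emPCR.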
Step 3: Sequencing
• A sequencing primer is annealed to the ssDNA at the A adapter.
• The beads are loaded into individual wells formed from finely packed and cut fiber optics (the PicoTiterPlate device).
• The size of the wells does not allow more than one ssDNA bead to be loaded into a well.
• Enzyme beads and packing beads are added. The enzyme beads carry sulfurylase and luciferase; the packing beads serve only to keep the DNA beads in place.
• Above the wells is a flow channel that passes nucleotides and apyrase over the plate on a timed schedule.

The Chemical Chain
• The nucleotides are flowed in a timed cycle (A, T, G, C) with 10 s between nucleotides; each nucleotide is followed by an apyrase wash before the next one is added.
• As a by-product of each incorporation, DNA polymerase releases a pyrophosphate molecule (PPi).
• The sulfurylase enzyme converts the PPi into ATP.

The Fireworks Show
• Each ATP produced by sulfurylase is used by luciferase.
• Luciferase uses each ATP molecule to oxidize its substrate, luciferin, producing oxyluciferin and light.
• Luciferin + ATP + O2 → AMP + oxyluciferin + PPi + CO2 + light (catalyzed by luciferase)
• A CCD camera records the light from the reaction.
• A wash of apyrase is released after each nucleotide to remove the unincorporated nucleotides.

Step 4: Data analysis
• The intensity of the light emitted by luciferase is proportional to the number of nucleotides incorporated.
• Therefore, if the signal from one nucleotide flow is 3 times as intense as the signal from a previous flow, 3 times as many nucleotides were incorporated in that flow.
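The proportionality described in Step 4 can be sketched as a very simple base-calling routine. This is a minimal illustration, not the instrument's actual signal processing: it assumes the signals are already normalized so that a single incorporation reads about 1.0, uses the A, T, G, C flow order given in the slides, and rounds each signal to the nearest whole number of bases. The function name and the example values are hypothetical.

```python
def decode_flowgram(signals, flow_order="ATGC"):
    """Turn normalized per-flow light intensities into a base sequence.

    signals: one intensity per nucleotide flow, assumed to be scaled so
             that one incorporated nucleotide gives a value near 1.0.
    flow_order: the repeating order in which nucleotides are flowed
                (A, T, G, C as listed in the slides).
    Returns the sequence of the newly synthesized strand, which is the
    complement of the template on the bead.
    """
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        # Round to the nearest whole count; noise below 0.5 counts as zero.
        count = max(0, int(signal + 0.5))
        bases.append(base * count)
    return "".join(bases)


if __name__ == "__main__":
    # Two full flow cycles with made-up intensities.
    example = [1.02, 0.05, 2.96, 0.10, 0.00, 1.10, 0.03, 2.04]
    print(decode_flowgram(example))  # prints AGGGTCC
```

The third flow (G) reads about 3 times the first flow (A), so it is called as three G bases in a row, which is the situation described in the last bullet of Step 4.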
Two Types of Analysis
• Run-time analysis:
  • Image acquisition: raw image
  • Image processing: mapping of the raw image to the corresponding wells
  • Signal processing: the individual well signals are combined into a flowgram
• Post-run processing (separate computer):
  • Assembly: overlaps multiple reads to create longer sequences, assembling a consensus sequence.
  • Mapping: maps the reads onto the consensus obtained from the assembly to “re-sequence” the genome.
  • Amplicon Variant Analysis: compares the sample reads to known reference sequences for identification.

The Titanium model
• Read lengths of 400-600 base pairs.
• 400-600 million base pairs read per run.
• About 1 million parallel reads.

Additional Links
• 454 Life Sciences:
  • www.454.com
• Detailed overview of the system:
  • http://www.454.com/products-solutions/multimediapresentations.asp
• Pyrosequencing animation:
  • http://www.youtube.com/watch?v=bFNjxKHP8Jc&feature=related
  • http://www.pyrosequencing.com/DynPage.aspx?id=7454
• Sequencing step animation:
  • http://www.youtube.com/watch?v=kYAGFrbGl6E