roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 8:59 PM Page 1 Learn more about the most published next generation sequencing platform. Bibliography April 18, 2008 Genome Sequencer FLX System More Flexibility, More Applications, More Publications www.roche-applied-science.com roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 8:59 PM Page 2 Genome Sequencer FLX System More flexibility, more applications The Genome Sequencer FLX System is the newest addition to Roche’s next-generation sequencing platform, building upon proven technology and peer-reviewed publications. ■ Generate over 100 million bases per 7.5-hour instrument run. ■ Achieve longer reads, averaging 250 to 300 bases. ■ Attain higher throughput with over 400,000 reads per run. ■ Generate single-read accuracy that is greater than 99.5% over 200-base reads. ■ Benefit from consensus accuracy that is greater than 99.99%. The Genome Sequencer FLX System features complete software packages, including easy-to-use graphical user interfaces (GUI) for mapping, assembly, and amplicon variation detection. ■ Use the project management software tool to group multiple instrument runs for analysis, and add instrument runs over time for additional analysis power. ■ Obtain higher-confidence results using next-generation algorithms — from base calling to assembly and mapping, along with variation analysis. Perform breakthrough science with more applications using the flexibility of the Genome Sequencer FLX System. ■ De Novo Sequencing Supported by Shotgun and Paired-End Applications ■ Comparative Genomics using Whole Genome Sequencing ■ Amplicon Resequencing ■ Transcriptome Analysis ■ Gene Regulation Studies ■ ChIP-Sequencing Genome Sequencer FLX System 1| roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 8:59 PM Page 3 Genome Sequencer FLX System Workflow One fragment = One bead = One read The Genome Sequencer FLX System provides a complete solution — from sample preparation through digital data analysis. 1) Sample Input: The Genome Sequencer FLX System supports the |——————————— x 400,000 + ———————————| sequencing of samples from a variety of starting materials, including genomic DNA, PCR products, BACs, and cDNA. 2) Sample Fragmentation: Samples such as genomic DNA and BACs are fractionated into small 300 to 800 base-pair fragments. For some samples, such as small non-coding RNA, fragmentation is not required. Short PCR products can be amplified using Genome Sequencer fusion primers to go directly to Step 4, shown below. 3) Adaptor Ligation: Using a series of standard molecular biology techniques, short adaptors (A and B) – specific for both the 3' and 5' ends – are added to each fragment. The adaptors will also be used for purification, amplification, and sequencing steps. Single-stranded fragments, shown here, are used in subsequent steps in the workflow. 4) One Fragment = One Bead: The first step in emulsion PCR is shown. The adaptors enable hundreds of thousands of single-stranded fragments to bind to their own unique beads. The beads are then encapsulated into individual micelles formed by a water-in-oil emulsion, creating a micro-reactor containing one bead with one unique fragment. Each unique fragment is amplified without the introduction of competing or contaminating sequences. The entire fragment collection is amplified in parallel. 5) Clonally Amplified Bead: The PCR process amplifies each fragment to a copy number of several million per bead. Subsequently, the emulsion PCR is broken while the fragments remain bound to their specific beads. After enrichment, the clonally amplified bead is ready to load onto the PicoTiterPlate device for sequencing. 6) Flowgram: A digital output of the real-time sequencing-by-synthesis. Each flowgram represents the sequence read from one bead which has one unique amplified fragment attached to it. The standard Genome Sequencer FLX System run produces over 400,000 flowgrams or reads per 7.5-hour instrument run. Flowgrams can be exported as .SFF files for submission to NCBI. Figure 1: Genome Sequencer FLX System Workflow Overview. www.genome-sequencing.com |2 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 8:59 PM Page 4 Sequencing Chemistry The Technology 454 Sequencing, based on sequencing-by-synthesis (Figure 1), is the power behind the performance of the Genome Sequencer FLX System. Sequencing-by-Synthesis: Using an enzymati- Nucleotides are flowed sequentially in a fixed order across the PicoTiterPlate device during a sequencing run. During the nucleotide flow, hundreds of thousands of beads each carrying millions of copies of a unique single-stranded DNA molecule are sequenced in parallel. If a nucleotide complementary to the template strand is flowed into a well, the polymerase extends the existing DNA strand by adding nucleotide(s). Addition of one (or more) nucleotide(s) results in a reaction that generates a light signal that is recorded by the CCD camera in the instrument. cally coupled reaction, light is generated when individual nucleotides are incorporated. Hundreds of thousands of individual DNA fragments are The signal strength is proportional to the number of nucleotides incorporated in a single nucleotide flow. sequenced in parallel. Multiplex Identifier Kits (MID) Increasing sample throughput per instrument run Uniquely tag up to twelve samples with new ligation Multiplex Identifiers (MIDs), reducing the cost per sample, increasing multiplexing capabilities, and simplifying the handling of multiple samples. Twelve MID adaptors are available, each containing a unique 10-nucleotide sequence. During library preparation, sample-specific MID adaptor molecules are blunt-end ligated onto each DNA fragment in a sample. The MID-tagged samples can be pooled for simultaneous amplification and sequencing. Used in conjunction with the available gasket, up to 192 samples can be sequenced per run. Each MID is recognized by the GS FLX analysis software, allowing for automated grouping and analysis of MID-containing reads. 3| ■ Simplify sample preparation by eliminating the need for individual amplification reactions. ■ Expand experimental design options by multiplexing samples. ■ Obtain more reads per sample by reducing the need for gaskets, making more PicoTiterPlate wells available for sequencing. ■ Reduce the risk of contamination by tagging each DNA molecule with a sample-specific MID. roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 8:59 PM Page 5 Sequencing Kits for the Genome Sequencer FLX System Sequencing PicoTiterPlate Kit Device GS LR70 70 x 75 GS SR70 70 x 75 GS LR25 25 x 75 Number of Regions per Gasket Read Length (bases) 2 large regions 4 medium regions 8 medium regions 16 small regions 2 large regions 4 medium regions 8 medium regions 16 small regions 240 240 240 240 100 100 100 100 1 medium region 4 small regions 240 240 Run Time Number of Reads per Region 7.5 hrs 4 hrs 7.5 hrs Number of Bases per Region Total Number or Reads per Gasket Total Number or Bases (all regions) 210,000 70,000 30,000 12,000 210,000 70,000 30,000 12,000 50.4 Mb 16.8 Mb 7.2 Mb 2.88 Mb 21 Mb 7 Mb 3 Mb 1.2 Mb 420,000 280,000 240,000 192,000 420,000 280,000 240,000 192,000 100.8 Mb 67.2 Mb 57.6 Mb 46.08 Mb 42 Mb 28 Mb 24 Mb 19.2 Mb 70,000 12,000 16.8 Mb 3 Mb 70,000 48,000 16.8 Mb 24 Mb Table 1. Number or reads and bases per PicoTiterPlate device configuration. Data was generated using genomic DNA from E. coli. Results may vary depending upon type of sample and organism, average read length observed for a specific genome, or application. (Mb = million bases) Recommended Applications Applications for GS LR70 Applications for GS SR70 Applications for GS LR25 ■ Whole genome sequencing ■ Small RNA ■ Titrations ■ Metagenomics ■ Expression profiling ■ Amplicons for resequencing ■ Long amplicon resequencing ■ Short amplicons (up to 100 pairs) ■ BACs ■ Shotgun sequencing of small genomes (virses, small bacteria, etc.) Configure the PicoTiterPlate Device using various gaskets More flexibility with different sample loading options For projects that require a higher number of samples with fewer sequencing reads per sample, a series of gaskets are provided for both the small and large PicoTiterPlate devices. These gaskets provide physical barriers to divide the area on the PicoTiterPlate device into smaller segments. When the gaskets are combined with the MID’s, a larger number of samples can be sequenced in one instrument run. Figure Legend: PicoTiterPlate and gaskets for GS LR25 Figure Legend: PicoTiterPlate and gaskets for GS LR70 and reagents supporting 1 and 4 samples GS SR70 reagents supporting 1/2, 4, 8 , and 16 sample configurations. |4 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 6 Genome Sequencer FLX System Workflow and Bioinformatics T Image Capture ▼ Data Processing A C G T Data Transfer to External PC ▼ Data Processing Image Files: 12-15 gigabytes per run Genome Sequencer FLX Instrument Genome Sequencer FLX Instrument ■ Linux server unit and software for running the instrument (Uninterruptible Power Supply [UPS] unit can run the instrument for up to 20 minutes in the event of power failure) ■ Software package on CD, which must be installed on an external PC.1 Package includes: — Data processing software: image and signal processing, base calling, and quality scoring — Application software External PC and Data Storage Requirements (not supplied) Computer requirements 64-bit dual processor (dual x86 CPU or better), 8 GB RAM, 500 GB hard drive or 32-bit dual processor, 4 GB RAM Operating system Red Hat Enterprise Linux 4 Workstation OS (32-bit and/or 64-bit version) WS Standard Java 1.5.0 or higher 5| 1 2 3 4 5 Ethernet 1 Gigabit Ethernet Raw data output (image files) 12-15 gigabytes of data per run using a 70x75 PicoTiterPlate device Processed data output (Data Analysis Folder) Up to 3 gigabytes per run using a 70x75 PicoTiterPlate device Data storage requirement 2 (for data backup) Backup of raw data such as image files is recommended (e.g., 400 gigabytes per tape drive). Raw data can be reprocessed using software updates. There is no limitation to the number of installations in external PCs, and there are no registration codes. Data storage requirement for the analyzed data depends on the number and type of analyses and project size. GS Reference Mapper - amount of time required is comparable to that needed for GS De Novo Assembler for the same amount of data. The time needed for data analysis using the different software tools strongly depends on project size and amount of data. Approximately 30,000 reads are mapped from 11 amplicons in 1.5 minutes. roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 7 Applications De Novo Assembly Mapping Amplicon Analysis Flowgram (read sequence) Application Software ■ GS Reference Mapper ■ GS De Novo Assembler ■ GS Amplicon Variant Analyzer Application Software Data output GS De Novo Assembler ■ De novo assembly of whole genomes (up to 120 megabases) • fna.file (sequence of contigs, FASTA format) ■ Assembly of approximately 400,000 reads in 10 - 15 minutes ■ Addition of 20 - 100 bp paired-end reads to order and orient the contigs into scaffolds • qual.file (corresponding Phred equivalent quality score) • ace.file (alignment of the reads to contig sequences) GS Reference Mapper ■ Resequencing of microbial and eukaryotic genomes (up to 3 gigabases) 3,4 ■ A given reference sequence in FASTA.file format can be compared with the sequence results ■ High-confidence sequence differences are listed • fna.file (sequence of contigs, FASTA format) • qual.file (corresponding Phred equivalent quality score) • ace.file (consensus alignment of the reads against a given reference sequence) GS Amplicon Variant Analyzer ■ Sequencing of PCR products with ultra-deep coverage depth (e.g., identification of somatic mutations) 4,5 • ace.file (alignment of the reads against a reference sequence) ■ Identification and quantification of sequence variations ■ Calculation of frequency for known variants and discovery of new variants • png.file (graphical file format, tab delimited text file) Data output compatibility ■ Data format is compatible with different publicly available tools (e.g., consed and data archive such as NCBI). ■ For applications (e.g., metagenomics) not supported by our software modules, publicly available tools can be used by bioinformatics specialists. |6 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 8 3K Long-Tag Paired End Sequencing with the Genome Sequencer FLX System The 3K Long-Tag Paired End library DNA fragments contain an approximately 250-bp fragment with a 44-mer adaptor sequence in the middle, flanked by 100-mer sequences, on average. The two flanking 100-bp sequences are segments of DNA that were originally located approximately 3 kb apart in the genome of interest. This method requires no cloning and generates 150,000 to 200,000 paired-end reads from a single Genome Sequencer FLX instrument run with a total elapsed time – from genomic DNA sample preparation to sequencing results – of less than four days. This new protocol for generating a library of paired-end fragments can be used to determine the orientation and relative positions of contigs produced by de novo shotgun sequencing and assembly. This 3K Long-Tag Paired End protocol (Fig. 1) can also be used to identify genomic structural variations and their associated breakpoints. Structural variation of the genome, involving large, kilo- to mega-base sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements, is widespread in humans and presumably responsible for a considerable amount of phenotypic variation.1 Figure Legend: Description of the 3K Long-Tag Paired End Sequencing protocol. Genomic DNA is randomly fragmented and subsequently circularized via adaptors. After circularization, the DNA is randomly fragmented, followed by enrichment for linker-positive fragments via streptavidin-bead (SA) capture. Long Paired End Adaptors (A and B; recipe provided in the GS FLX Paired End DNA Library Preparation Method Manual) are then ligated onto the linker-positive fragment ends. Fragments that contain the A and B Adaptors are captured via streptavidin beads (SA) for emPCR amplification and subsequent 454 SequencingTM. REFERENCE 1. Korbel, J.O. et al. Paired-End Mapping Reveals Extensive Structural Variation in the Human Genome. Science 318, 420-426 (2007). 7| roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 9 Bibliography of peer reviewed articles using the Genome Sequencer Platform HIV Studies Hoffmann, C. et al. “DNA bar coding and pyrosequencing to identify rare HIV drug resistance mutations.” Nucleic Acids Research 35 (13), e91 (2007). Wang, C. et al. “Characterization of mutation spectra with ultra-deep pyrosequencing: Application to HIV-1 drug resistance.” Genome Research 17 (8), 1195-1201 (2007). Wang, G.P. et al. “HIV integration site selection: Analysis by massively parallel pyrosequencing reveals association with epigenetic modifications.” Genome Research 17 (8), 1186-1194 (2007). BACs/Plastids/Mitochondria/etc. Baker, S. et al. “A novel linear plasmid mediates flagellar variation in Salmonella Typhi.” PLoS Pathogens 3 (5), e59 (2007). Cai, Z. et al. “Complete plastid genome sequences of Drimys, Liriodendron, and Piper: implications for the phylogenetic relationships of magnoliids.” BMC Evolutionary Biology 6, 77 (2006). Cuadros-Orellana, S. et al. “Genomic plasticity in prokaryotes: the case of the square haloarchaeon.” The ISME Journal 1 (3), 235–245 (2007). Jex, A.R. et al. “Using 454 technology for long-PCR based sequencing of the complete mitochondrial genome from single Haemonchus contortus (Nematoda).” BMC Genomics 9, 11 (2008). Meyer, M. et al. “Targeted high-throughput sequencing of tagged nucleic acid samples.” Nucleic Acids Research 35 (15), e97 (2007). Moore, M.J. et al. “Rapid and accurate pyrosequencing of angiosperm plastid genomes.” BMC Plant Biology 6, 17 (2006). Wicker, T. et al. “454 sequencing put to the test using the complex genome of barley.” BMC Genomics 7, 275 (2006). View a selection of these publications and learn more about the Genome Sequencer FLX System at www.genome-sequencing.com NEW Metagenomics and Microbial Diversity Angly, F.E. et al. “The marine viromes of four oceanic regions.” PLoS Biology, 4 (11), e368 (2006). Cox-Foster, D.L. et al. “A metagenomic survey of microbes in honey bee colony collapse disorder.” Science 318 (5848), 283-287 (2007). NEW Desnues C. et al. “Biodiversity and biogeography of phages in modern stromatolites and thrombolites.” Nature, 2008 Mar 2 [Epub ahead of print]. NEW Dinsdale E. A. et al. “Microbial ecology of four coral atolls in the northern line islands.” PLoS ONE, 3 (2):e1584 (2008). NEW Dinsdale E. A. et al. “Functional metagenomic profiling of nine biomes.” Nature, 2008 Mar 12 [Epub ahead of print] NEW Dowd S. E. et al. “Survey of bacterial diversity in chronic wounds using Pyrosequencing, DGGE, and full ribosome shotgun sequencing.” BMC Microbiology, 8 (1), 43 (2008). Edwards, R.A. et al. “Using pyrosequencing to shed light on deep mine microbial ecology.” BMC Genomics 7, 57 (2006). Huber, J.A. et al. “Microbial population structures in the deep marine biosphere.” Science 318 (5847), 97-100 (2007). Huson, D.H. et al. “MEGAN analysis of metagenomic data.” Genome Research 17 (3), 377-386 (2007). NEW Jorge Frias-Lopez, J. et al. “Microbial community gene expression in ocean surface waters.” Proc Natl. Acad. Sci. USA, 105 (10), 3805–3810 (2008). NEW Krause, L. et al. “Phylogenetic classification of short environmental DNA fragments.” Nucleic Acids Research, (February 19, 2008) [Epub ahead of print]. Krause, L. et al. “Finding novel genes in bacterial communities isolated from the environment.” Bioinformatics 22 (14), e281-e289 (2006). Leininger, S. et al. “Archaea predominate among ammonia-oxidizing prokaryotes in soils.” Nature 442 (7104), 806-809 (2006). NEW Luiz F. W. et al. “Pyrosequencing enumerates and contrasts soil microbial diversity.” The ISME Journal, 1, 283–290 (2007). indicates journal articles that have been published since February 12, 2008. NEW McKenna P. et al. “The Macaque Gut Microbiome in Health, Lentiviral Infection, and Chronic Enterocolitis.” PLoS Pathogens, 4 (2), e20 (2008). NEW Mulholland P. J. et al. “Stream denitrification across biomes and its response to anthropogenic nitrate loading.” Nature 452, 202-205 (2008). |8 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 10 Metagenomics and Microbial Diversity (continued) Palacios, G. et al. “A new arenavirus in a cluster of fatal transplant-associated diseases.” The New England Journal of Medicine 358; published at www.nejm.org on February 6, 2008; doi:10.1056/NEJMoa073785 (2008). Ancient DNA Sogin, M.L. et al. “Microbial diversity in the deep sea and the underexplored ‘rare biosphere’.” Proc. Natl. Acad. Sci. USA 103 (32), 12115-12120 (2006). Briggs, A.W. et al. “Patterns of damage in genomic DNA sequences from a Neandertal.” Proc. Natl. Acad. Sci. USA 104 (37), 14616-14621 (2007). Sundquist, A. et al. “Bacterial flora typing with deep, targeted, chip-based Pyrosequencing.” BMC Microbiology 7, 108 (2007). Gilbert, M.T. et al. “Recharacterization of ancient DNA miscoding lesions: insights in the era of sequencingby-synthesis.” Nucleic Acids Research 35 (1), 1-10 (2007). Turnbaugh, P.J. et al. “An obesity-associated gut microbiome with increased capacity for energy harvest.” Nature 444 (7122), 1027-1031 (2006). Gilbert, M.T. et al. “Whole-genome shotgun sequencing of mitochondria from ancient hair shafts.” Science 317 (5846), 1927-1930 (2007). Warnecke, F. et al. “Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite.” Nature 450 (7169), 560-565 (2007). Green, R.E. et al. “Analysis of one million base pairs of Neanderthal DNA.” Nature, 444 (7117), 330-336 (2006). Wegley, L. et al. “Metagenomic analysis of the microbial community associated with the coral Porites astreoides.” Environmental Microbiology 9 (11), 2707-2719 (2007). ChIP Sequencing/Epigentics/ Structural Variation Albert, I. et al. “Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome.” Nature 446 (7135), 572-576 (2007). Bhinge, A.A. et al. “Mapping the chromosomal targets of STAT1 by Sequence Tag Analysis of Genomic Enrichment (STAGE).” Genome Research 17 (6), 910-916 (2007). Boyle, A.P. et al. “High-resolution mapping and characterization of open chromatin across the genome.” Cell 132 (2), 311-322 (2008). NEW Chen, J. et al. “Scanning the human genome at kilobase resolution.” Genome Research (published online Feb 21, 2008). Dostie, J. et al. “Chromosome conformation capture carbon copy (5C): a massively parallel solution for mapping interactions between genomic elements.” Genome Research 16 (10), 1299-1309 (2006). Johnson, S.M. et al. “Flexibility and constraint in the nucleosome core landscape of Caenorhabditis elegans chromatin.” Genome Research 16 (12), 1505-1516 (2006). Korbel, J.O. et al. “Paired-end mapping reveals extensive structural variation in the human genome.” Science 318 (5849), 420-426 (2007). Nagel, S. et al. “Activation of TLX3 and NKX2-5 in t(5;14) (q35;q32) T-Cell acute lymphoblastic leukemia by remote 3'-BCL11B enhancers and coregulation by PU.1 and HMGA1.” Cancer Research 67 (4), 1461-1471 (2007). 9| Swaminathan, K., Varala, K., and Hudson, M.E. “Global repeat discovery and estimation of genomic copy number in a large, complex genome using a highthroughput 454 sequence survey.” BMC Genomics 8, 132 (2007). Noonan, J.P. et al. “Sequencing and analysis of Neanderthal genomic DNA.” Science 314 (5802), 11131118 (2006). Poinar, H.N. et al. “Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA.” Science 311 (5759), 392-394 (2006). Stiller, M. et al. “Patterns of nucleotide misincorporations during enzymatic amplification and direct large-scale sequencing of ancient DNA.” Proc. Natl. Acad. Sci. USA 103 (37), 13578-13584 (2006). Whole Genome Sequencing Andries, K. et al. “A diarylquinoline drug active on the ATP synthase of Mycobacterium tuberculosis.” Science 307 (5707), 223-227 (2005). NEW Bekal S. et al. “Genomic DNA sequence comparison between two inbred soybean cyst nematode biotypes facilitated by massively parallel 454 micro-bead sequencing.” Mol Genet Genomics, 2008 Mar 7 [Epub ahead of print]. Chen, Y. et al. “The capsule polysaccharide structure and biogenesis for non-O1 Vibrio cholerae NRT36S: genes are embedded in the LPS region.” BMC Microbiology 7, 20 (2007). Chen, Y. et al. “The genome of non-O1 Vibrio cholerae NRT36S demonstrates the presence of pathogenic mechanisms that are distinct from O1 Vibrio cholerae.” Infection and Immununity, 75 (5), 2645-2647 (2007). Deng, L. et al. “Identification of novel antipoxviral agents: mitoxantrone inhibits vaccinia virus replication by blocking virion assembly.” Journal of Virology 81 (24), 13392-13402 (2007). NEW Giannakis M. et al. “Helicobacter pylori evolution during progression from chronic atrophic gastritis to gastric cancer and its impact on gastric stem cells.” Proc Natl. Acad. Sci. USA, 105 (11), 4358-4363 (2008). roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Whole Genome Sequencing (continued) Gioia, J. et al. “Paradoxical DNA repair and peroxide resistance gene conservation in Bacillus pumilus SAFR032.” PLoS ONE 2 (9), e928 (2007). Goldberg, S.M. et al. “A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes.” Proc. Natl. Acad. Sci. USA 103 (30), 11240-11245 (2006). Hallen, H.E. et al. “Gene family encoding the major toxins of lethal Amanita mushrooms.” Proc. Natl. Acad. Sci. USA 104 (48), 19097-19101 (2007). Highlander, S.K. et al. “Subtle genetic changes enhance virulence of methicillin resistant and sensitive Staphylococcus aureus.” BMC Microbiology 7, 99 (2007). Hiller, N.L. et al. “Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: insights into the pneumococcal supragenome.” Journal of Bacteriology 189 (22), 8186-8195 (2007). Hofreuter, D. et al. “Unique features of a highly pathogenic Campylobacter jejuni strain.” Infection and Immunity 74 (8), 4694-4707 (2006). Hogg, J.S. et al. “Characterization and modeling of the Haemophilus influenzae core- and supra-genomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains.” Genome Biology 8 (6), R103 (2007). NEW Hongoh, Y. et al. “Complete genome of the uncultured Termite Group 1 bacteria in a single host protist cell.” Proc Natl. Acad. Sci. USA, 105 (14), 5555-5560 (2008) Lasken, R.S. and Stockwell, T.B. “Mechanism of chimera formation during the Multiple Displacement Amplification Reaction.” BMC Biotechnology 7, 19 (2007). NEW La Scola, B. et al. “Rapid comparative genomic analysis for clinical microbiology: The Francisella tularensis paradigm.” Genome Research, published online Apr 11, 2008 NEW Lee, J.S. et al. “Mutation in the Transcriptional Regulator PhoP Contributes to Avirulence of Mycobacterium tuberculosis H37Ra Strain.” Cell Host & Microbe, 3, 97–103 (2008). Page 11 DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula.” BMC Genomics 8, 427 (2007). Malhi, R.S. et al. “MamuSNP: a resource for rhesus macaque (Macaca mulatta) genomics.” PLOS One 2 (5), e438 (2007). McCutcheon, J.P. and Moran, N.A. “Parallel genomic evolution and metabolic interdependence in an ancient symbiosis.” Proc. Natl. Acad. Sci. USA 104 (49), 19392–19397 (2007). Mouton, L. et al. “Variable-number tandem repeats as molecular markers for biotypes of Pasteuria ramosa in Daphnia spp.” Applied and Environmental Microbiology 73 (11), 3715–3718 (2007). Mußmann, M. et al. “Insights into the genome of large sulfur bacteria revealed by analysis of single filaments.” PLoS Biology 5 (9), e230 (2007). Oh, J.D. et al. “The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: evolution during disease progression.” Proc. Natl. Acad. Sci. USA 103 (26), 9999-10004 (2006). Pearson, B.M. et al. “The complete genome sequence of Campylobacter jejuni strain 81116 (NCTC11828).” Journal of Bacteriology 189 (22), 8402-8403 (2007). Pinard, R. et al. “Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing.” BMC Genomics 7, 216 (2006). NEW Paustian, M.L. et al. “Comparative genomic analysis of Mycobacterium avium subspecies obtained from multiple host species.” BMC Genomics, 9 (1), 135 (2008). Pol, A. et al. “Methanotrophy below pH 1 by a new Verrucomicrobia species.” Nature 450 (7171), 874-878 (2007). Poly, F. et al. “Genome sequence of a clinical isolate of Campylobacter jejuni from Thailand.” Infection and Immunity 75 (7), 3425-3433 (2007). Raymond, J.A., Fritsen, C., and Shen, K. “An ice-binding protein from an Antarctic sea ice bacterium.” FEMS Microbiology Ecology 61 (2), 214-221 (2007). Macas, J., Neumann, P., and Navratilova, A. “Repetitive Flowgram showing a single read of 256 bases. Each bar represents a discrete base (A, T, C, or G), and the height of a bar correlates to the number of bases in a specific position. | 10 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 12 Whole Genome Sequencing (continued) Smith, M.G. et al. “New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis.” Genes & Development 21 (5), 614 (2007). NEW Solda, G. et al. “Non-random retention of protein-coding overlapping genes in Metazoa.” BMC Genomics, 9(1), 174 (2008). NEW Stephen J. Spatz, S.J. and Rue, C. A, “Sequence determination of a mildly virulent strain (CU-2) of Gallid herpesvirus type 2 using 454 pyrosequencing.” Virus Genes, DOI 10.1007/s11262-008-0213-5 (2008). Swingley, W.D. et al. “Niche adaptation and genome expansion in the chlorophyll d-producing cyanobacterium Acaryochloris marina.” Proc. Natl. Acad. Sci. USA 105 (6), 2005-2010 (2008). NEW Tauch, A. et al. “The lifestyle of Corynebacterium urealyticum derived from its complete genome sequence established by pyrosequencing.” Journal of Biotechnology, Available online 10 March 2008. NEW Tauch, A. et al. “Ultrafast pyrosequencing of Corynebacterium kroppenstedtii DSM44385 revealed insights into the physiology of a lipophilic corynebacterium that lacks mycolic acids.” Journal of Biotechnology, Available online 20 March 2008 Thomson, N.R. et al. “Chlamydia trachomatis: genome sequence analysis of lymphogranuloma venereum isolates.” Genome Research 18 (1), 161-171 (2008). Velicer, G.J. et al. “Comprehensive mutation identification in an evolved bacterial cooperator and its cheating ancestor.” Proc. Natl. Acad. Sci. USA 103 (21), 8107-8112 (2006). Velasco, R. et al. “A high quality draft consensus sequence of the genome of a heterozygous grapevine variety.” PLoS One 2 (12), e1326 (2007). Wackett, L.P. et al. “Genomic and biochemical studies demonstrating the absence of an alkane-producing phenotype in Vibrio furnissii M1.” Applied and Environmental Microbiology 73 (22), 7192-7198 (2007). NEW Wheeler, D.A. et al. “The complete genome of an individual by massively parallel DNA sequencing.” Nature 452, 872 - 876 (2008). Zuber, S. et al. “Genome analysis of phage JS98 defines a fourth major subgroup of T4-like phages in Escherichia coli.” Journal of Bacteriology 189 (22), 8206-8214 (2007). View a selection of these publications and learn more about the Genome Sequencer FLX System at www.genome-sequencing.com NEW Thomas, J.A. et al. “Complete genomic sequence and mass spectrometric analysis of highly diverse, atypical Bacillus thuringiensis phage 0305varphi8-36.” Virology 368 (2), 405-421 (2007). 11 | indicates journal articles that have been published since February 12, 2008. roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Transcriptomes Sequencing Bainbridge, M.N. et al. “Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-bysynthesis approach.” BMC Genomics 7, 246 (2006). Barbazuk, W.B. et al. “SNP discovery via 454 transcriptome sequencing.” The Plant Journal 51 (5), 910-918 (2007). Cheung, F. et al. “Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology.” BMC Genomics 7, 272 (2006). Emrich, S.J. et al. “Gene discovery and annotation using LCM-454 transcriptome sequencing.” Genome Research 17 (1), 69-73 (2007). Eveland, A.L., McCarty, D.R., and Koch, K.E. “Transcript profiling by 3'-untranslated region sequencing resolves expression of gene families.” Plant Physiology 146 (1), 32-44 (2008). Feng, H. et al. “Clonal integration of a polyomavirus in human Merkel cell carcinoma.” Science Published online January 17, 2008; doi:10.1126/science.1152586 (2008). Gowda, M. et al. “Robust analysis of 5'-transcript ends (5'-RATE): a novel technique for transcriptome analysis and genome annotation.” Nucleic Acids Research 34 (19), e126 (2006). Jones-Rhoades M.W., Borevitz J.O., and Preuss, D. “Genome-wide expression profiling of the Arabidopsis female gametophyte identifies families of small, secreted proteins.” PLoS Genetics 3 (10), e171 (2007). Ng, P. et al. “Multiplex sequencing of paired-end ditags (MS-PET): a strategy for the ultra-high-throughput analysis of transcriptomes and genomes.” Nucleic Acids Research 34 (12), e84 (2006). Nielsen, K.L. et al. “DeepSAGE-digital transcriptomics with high sensitivity, simple experimental protocol and multiplexing of samples.” Nucleic Acids Research 34 (19), e133 (2006). Ohtsu, K. et al. “Global gene expression analysis of the shoot apical meristem of maize.” (Zea mays L.). The Plant Journal 52 (3), 391-404 (2007). Peano, C. et al. “Complete gene expression profiling of Saccharopolyspora erythraea using GeneChip DNA microarrays.” Microbial Cell Factories 6 (1), 37; Published online November 26, 2007; doi:10.1186/1475-2859-6-37 (2007). Robb, S.M., Ross, E., and Alvarado, A.S. “SmedGD: the Schmidtea mediterranea genome database.” Nucleic Acids Research 36 (Database issue), D599-D606 (2008). Ruan, Y. et al. “Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs).” Genome Research 17 (6), 828-838 (2007). Page 13 Torres, T.T. et al. “Gene expression profiling by massively parallel sequencing.” Genome Research 18 (1), 172-177 (2008). Toth, A.L. et al. “Wasp gene expression supports an evolutionary link between maternal behavior and eusociality.” Science 318 (5849), 441-444 (2007). Vera, J.C. et al. “Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing.” Molecular Ecology (OnlineEarly Articles) Published online February 5, 2008; doi:10.1111/j.1365-294X.2008.03666.x (2008). NEW Wang, X. et al. “Using RNase sequence specificity to refine the identification of RNA-protein binding regions.” BMC Genomics, 9 (Suppl 1), S17 (2008). Weber, A.P.M. et al. “Sampling the Arabidopsis transcriptome with massively-parallel pyrosequencing.” Plant Physiology 144 (1), 32-42 (2007). Amplicons and Methylation Binladen, J. et al. “The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.” PLoS ONE 2 (2), e197 (2007). Dahl, F. et al. “Multigene amplification and massively parallel sequencing for cancer mutation discovery.” Proc. Natl. Acad. Sci. USA 104 (22), 9387-9392 (2007). Korshunova, Y. et al. “Massively parallel bisulphite pyrosequencing reveals the molecular complexity of breast cancer-associated cytosine-methylation patterns obtained from tissue and serum DNA.” Genome Research 18 (1), 19-29 (2008). Marshall, H.M. et al. “Role of PSIP1/LEDGF/p75 in lentiviral infectivity and integration targeting.” PLoS ONE 2 (12), e1340 (2007). Ordway, J.M. et al. “Identification of novel highfrequency DNA methylation changes in breast cancer.” PLoS ONE 2 (12), e1314 (2007). Pettersson, E. et al. “Allelotyping by massively parallel pyrosequencing of SNP-carrying trinucleotide threads.” Human Mutation 29 (2), 323-329 (2008). Taylor, K.H. et al. “Ultradeep bisulfite sequencing analysis of DNA methylation patterns in multiple gene promoters by 454 sequencing.” Cancer Research 67 (18), 8511-8518 (2007). Thomas, R.K. et al. “Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing.” Nature Medicine 12 (7), 852-855 (2006). Thomas, R.K. et al. “High-throughput oncogene mutation profiling in human cancer.” Nature Genetics 39 (3), 347351 (2007). NEW Sugarbaker, D. J., et al. “Transcriptome sequencing of malignant pleural mesothelioma tumors.” Proc Natl. Acad. Sci. USA, 105 (9), 3521–3526 (2008). | 12 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Page 14 Technology and Bioinformatics Brockman, W. et al. “Quality scores and SNP detection in sequencing-by-synthesis systems.” Genome Research Published online ahead of print January 22, 2008 as doi:10.1101/gr.070227.107 (2008). Chaisson, M.J. and Pevzner, P.A. “Short read fragment assembly of bacterial genomes.” Genome Research 18 (2), 324-330 (2008). NEW Hamady, M. et al. “Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex.” Nature Methods 5, 235 - 237 (2008). Huse, S.M. et al. “Accuracy and quality of massivelyparallel DNA pyrosequencing.” Genome Biology 8 (7), R143 (2007). Liu, Z. et al. “Short pyrosequencing reads suffice for accurate microbial community analysis.” Nucleic Acids Research 35 (18), e120 (2007). Marcy, Y. et al. “Nanoliter reactors improve multiple displacement amplification of genomes from single cells.” PLoS Genetics 3 (9), e155 (2007). Margulies, M. et al. “Genome sequencing in microfabricated high-density picolitre reactors.” Nature 437 (7057), 376-380 (2005). NEW Meyer, M., Stenzel, U., and Hofreiter, M. “Parallel tagged sequencing on the 454 platform” Nature Protocols 3, 267-278 (2008). Meyer, M. et al. “From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing.” Nucleic Acids Research 36 (1), e5 (2008). Sequence Capture Arrays (Sample Enrichment) Albert, T.J. et al. “Direct selection of human genomic loci by microarray hybridization.” Nature Methods 4 (11), 903905 (2007). Small RNA Aravin, A.A. et al. “Developmentally regulated piRNA clusters implicate MILI in transposon control.” Science 316 (5825), 744-747 (2007). Axtell, M.J. et al. “A two-hit trigger for siRNA biogenesis in plants.” Cell 127 (3), 565–577 (2006). Axtell, M.J., Snyder, J.A., and Bartel, D.P. “Common functions for diverse small RNAs of land plants.” Plant Cell 19 (6), 1750-1769 (2007). Barakat, A. et al. “Conservation and divergence of microRNAs in Populus.” BMC Genomics 8, 481 (2007). Barakat, A. et al. “Large-scale identification of microRNAs from a basal eudicot (Eschscholzia californica) and conservation in flowering plants.” The Plant Journal 51 (6), 991-1003 (2007). Berezikov, E. et al. “Diversity of microRNAs in human and chimpanzee brain.” Nature Genetics 38 (12), 1375-1377 (2006). Brennecke, J. et al. “Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila.” Cell 128 (6), 1089 -1103 (2007). Burnside, J. et al. “Marek’s disease virus encodes microRNAs that map to meq and the latency-associated transcript.” Journal of Virology 80 (17), 8778-8786 (2006). Parameswaran, P. et al. “A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing.” Nucleic Acids Research 35 (19), e130 (2007). Calabrese, J.M. et al. “RNA sequence analysis defines Dicer's role in mouse embryonic stem cells.” Proc. Natl. Acad. Sci. USA 104 (46), 18097-18102 (2007). Quinlan, A.R. et al. “Pyrobayes: an improved base caller for SNP discovery in pyrosequences.” Nature Methods 5 (2), 179-181 (2008). Fahlgren, N. et al. “High-throughput sequencing of Arabidopsis microRNAs: evidence for frequent birth and death of MIRNA genes.” PLoS ONE 2 (2), e219 (2007). Trombetti, G.A. et al. “Data handling strategies for high throughput pyrosequencers.” BMC Bioinformatics 8 (Suppl 1), S22 (2007). Girard, A. et al. “A germline-specific class of small RNAs binds mammalian Piwi proteins.” Nature 442 (7099), 199-202 (2006). NEW Vacic, V. et al. “A probabilistic method for small RNA flowgram matching.” Pacific Symposium on Biocomputing; 75-86 (2008). Henderson, I.R. et al. “Dissecting Arabidopsis thaliana DICER function in small RNA processing, gene silencing and DNA methylation patterning.” Nature Genetics 38 (6), 721-725 (2006). van Orsouw, N.J. et al. “Complexity Reduction of Polymorphic Sequences (CRoPSTM): a novel approach for large-scale polymorphism discovery in complex genomes.” PLoS ONE 2 (11), e1172 (2007). Houwing, S. et al. “A role for Piwi and piRNAs in germ cell maintenance and transposon silencing in zebrafish.” Cell 129 (1), 69-82 (2007). Howell, M.D. et al. “Genome-wide analysis of the RNA-DEPENDENT RNA POLYMERASE6/DICER-LIKE4 pathway in Arabidopsis reveals dependency on miRNA- and tasiRNA-directed targeting.” Plant Cell 19 (3), 926-942 (2007). 13 | roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:00 PM Small RNA (continued) Jin, H. et al. “Small RNAs and the regulation of cis-natural antisense transcripts in Arabidopsis.” BMC Molecular Biology 9 (1), 6 Published online ahead of print January 14, 2008; doi:10.1186/1471-2199-9-6 (2008). Johnson, C. et al. “CSRDB: a small RNA integrated database and browser resource for cereals.” Nucleic Acids Research 35 (Database issue), D829-D833 (2007). Kasschau, K.D. et al. “Genome-wide profiling and analysis of Arabidopsis siRNAs.” PLoS Biology 5 (3), e57 (2007). Katiyar-Agarwal, S. et al. “A novel class of bacteriainduced small RNAs in Arabidopsis.” Genes & Development 21 (23), 3123-3134 (2007). Lau, N.C. et al. “Characterization of the piRNA complex from rat testes.” Science 313 (5785), 363-367 (2006). NEW Lu, J. et al. “The birth and death of microRNA genes in Drosophila.” Nature Genetics, 40 (3), 351-355 (2008). Lu, C. et al. “MicroRNAs and other small RNAs enriched in the Arabidopsis RNA-dependent RNA polymerase-2 mutant.” Genome Research 16 (10), 1276-1288 (2006). Molnár, A. et al. “miRNAs control gene expression in the single-cell alga Chlamydomonas reinhardtii.” Nature 447 (7148), 1126-1129 (2007). NEW Mosher RA, et al. “PolIVb influences RNA-directed DNA methylation independently of its role in siRNA biogenesis.” Proc Natl. Acad. Sci. USA,105 (8), 3145-3150 (2008). Pak, J. and Fire, A. “Distinct populations of primary and secondary effectors during RNAi in C. elegans.” Science 315 (5809), 241-244 (2007). Page 15 NEW Seitz, H., Ghildiyal, M., and Zamore, P. D. “Argonaute loading improves the 5' precision of both MicroRNAs and their miRNA strands in flies.” Current Biology, 18 (2), 147-51 (2008). NEW Sunkar, R. et al. “Identification of novel and candidate miRNAs in rice by high throughput Sequencing.” BMC Plant Biology, 8, 25 (2008). Tarasov, V. et al. “Differential regulation of microRNAs by p53 revealed by massively parallel sequencing.” Cell Cycle 6 (13), 1586-1593 (2007). Tyler, D.M. et al. “Functionally distinct regulatory RNAs generated by bidirectional transcription and processing of microRNA loci.” Genes & Development 22 (1), 26-36 (2008). Yao, Y. et al. “Cloning and characterization of microRNAs from wheat (Triticum aestivum L.).” Genome Biology 8 (6), R96 (2007). Zhang, X. et al. “Role of RNA polymerase IV in plant small RNA metabolism.” Proc. Natl. Acad. Sci. USA 104 (11), 4536-4541 (2007). Zhang, L. et al. “Systematic identification of C. elegans miRISC proteins, miRNAs, and mRNA targets by their interactions with GW182 proteins AIN-1 and AIN-2.” Molecular Cell 28 (4), 598–613 (2007). Zhao, T. et al. “A complex system of small RNAs in the unicellular green alga Chlamydomonas reinhardtii.” Genes & Development 21 (10), 1190-1203 (2007). View a selection of these publications and learn more about the Genome Sequencer FLX System at www.genome-sequencing.com NEW indicates journal articles that have been published since February 12, 2008. NEW Pandey, S.P. et al. “Herbivory-induced changes in the small-RNA transcriptome and phytohormone signaling in Nicotiana attenuate.” Proc Natl. Acad. Sci. USA, 2008 Mar 13 [Epub ahead of print]. Qi, Y. et al. “Distinct catalytic and non-catalytic roles of ARGONAUTE4 in RNA-directed DNA methylation.” Nature 443 (7114), 1008-1112 (2006). Rajagopalan, R. et al. “A diverse and evolutionarily fluid set of microRNAs in Arabidopsis thaliana.” Genes & Development 20 (24), 3407-3425 (2006). Ruby, J.G. et al. “Evolution, biogenesis, expression, and target predictions of a substantially expanded set of Drosophila microRNAs.” Genome Research 17 (12), 1850-1864 (2007). Ruby, J.G., Jan, C.H., and Bartel, D.P. “Intronic microRNA precursors that bypass Drosha processing.” Nature 448 (7149), 83-86 (2007). Ruby, J.G. et al. “Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans.” Cell 127 (6), 1193-1207 (2006). | 14 roche.qxp:GSFLX_Biblio_05051347001.qxd 4/29/08 9:01 PM Page 16 Published by Roche Diagnostics Roche Applied Science Indianapolis, Indiana © 2008 Roche Diagnostics. All rights reserved. Printed in the USA. For life science research only. Not for use in diagnostic procedures. 454, 454 LIFE SCIENCES, and 454 SEQUENCING are trademarks of 454 Life Sciences Corporation, Branford, CT, USA. Other brands or product names are trademarks of their respective holders. 05051347001 • 04/08 www.roche-applied-science.com