GS FLX Titanium Sequencing

Genome Sequencer FLX+
Whole Genome Sequencing
For Life Science Research use only. Not for diagnostic purposes
Project Information
Client
C1383 - SV912148
Patrick Delahunty
Medicinal Genomics
Sample Information
SID    Sample              Sample Type
18052  1 (rec'd 27JUNE11)  Genomic DNA
Quote
SW0001MEDG0511
Deliverable
1. 454 will prepare one library for the customer supplied plant sample.
2. 454 will sequence this library in 12 runs using the GS FLX+ chemistry. Typical yield is 900,000
to 1.3 million reads per run; however, actual yield may vary. PLEASE NOTE: Customer
would like 454 to provide run QC information demonstrating performance of the library.
3. 454 will assemble the resulting data set using the latest released version of the GS Assembler
software. PLEASE NOTE: Customer would like to receive initial assembly results
following the completion of 6 runs in order to assess the need for the 6 additional
sequencing runs.
4. 454 will provide all quality filtered reads and associated quality scores in FASTA format along
with assembly results.
Delivery Date
August 01, 2011
Sequencing Results
Run Name
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052
D_2011_07_13_10_42_02_frontend2_fullProcessing (2.6)
Run Statistics
Region  SID    Sample              HQ Reads  HQ Bases     Avg Read Length  Mode Read Length  %Mixed  %Dots
1       18052  1 (rec'd 27JUNE11)  700,573   448,334,655  640              748               9.41%   3.75%
2       18052  1 (rec'd 27JUNE11)  712,483   448,772,334  630              732               9.96%   3.65%
Table 1.0 - Summary of run metrics. %Mixed displays the percentage of reads filtered out by the
mixed filter, where a mixed read is the result of simultaneously sequencing a mixture of different
DNA molecules. %Dots shows the percentage of reads filtered by the dots filter. A dot is an
instance of 3 successive nucleotide flows that record no incorporation.
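To make the dot definition concrete, the following Python sketch counts dots in a single read's flowgram. It is an illustration only, not the GS software's implementation: the 0.5 signal threshold for "no incorporation", the choice to count non-overlapping windows, and the function name count_dots are all assumptions introduced here.

```python
def count_dots(flow_signals, no_call_threshold=0.5, run_length=3):
    """Count dots in one read's flow signals.

    A dot is taken to be `run_length` successive flows whose signal is
    below `no_call_threshold` (assumed cutoff for "no incorporation").
    Windows are counted without overlap, which is also an assumption.
    """
    dots = 0
    empty_run = 0
    for signal in flow_signals:
        if signal < no_call_threshold:
            empty_run += 1
            if empty_run == run_length:
                dots += 1
                empty_run = 0  # start looking for the next dot
        else:
            empty_run = 0
    return dots


if __name__ == "__main__":
    # Hypothetical flow signals: one stretch of three empty flows, one of two.
    example = [1.02, 0.05, 0.10, 0.03, 0.98, 2.10, 0.00, 0.10, 1.00]
    print(count_dots(example))
```

How many dots cause a read to be filtered is not stated in this report, so the sketch stops at counting them.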
Read Length Distribution
Figure 1.0 - Read length distribution of high quality reads. R1 = 1 (rec'd 27JUNE11), R2 = 1 (rec'd
27JUNE11).
Run Name
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052
D_2011_07_26_13_34_02_frontend2_fullProcessing (2.6)
Run Statistics
Region  SID    Sample              HQ Reads  HQ Bases     Avg Read Length  Mode Read Length  %Mixed  %Dots
1       18052  1 (rec'd 27JUNE11)  605,878   355,813,952  587              718               12.95%  3.68%
2       18052  1 (rec'd 27JUNE11)  552,550   335,963,273  608              750               14.69%  5.48%
Table 2.0 - Summary of run metrics. %Mixed displays the percentage of reads filtered out by the
mixed filter, where a mixed read is the result of simultaneously sequencing a mixture of different
DNA molecules. %Dots shows the percentage of reads filtered by the dots filter. A dot is an
instance of 3 successive nucleotide flows that record no incorporation.
Read Length Distribution
Figure 2.0 - Read length distribution of high quality reads. R1 = 1 (rec'd 27JUNE11), R2 = 1 (rec'd
27JUNE11).
Run Name
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052
D_2011_07_26_13_30_02_frontend2_fullProcessing (2.6)
Run Statistics
Region  SID    Sample              HQ Reads  HQ Bases     Avg Read Length  Mode Read Length  %Mixed  %Dots
1       18052  1 (rec'd 27JUNE11)  631,895   376,001,214  595              749               9.67%   3.78%
2       18052  1 (rec'd 27JUNE11)  564,899   328,045,722  581              757               12.41%  3.97%
Table 3.0 - Summary of run metrics. %Mixed displays the percentage of reads filtered out by the
mixed filter, where a mixed read is the result of simultaneously sequencing a mixture of different
DNA molecules. %Dots shows the percentage of reads filtered by the dots filter. A dot is an
instance of 3 successive nucleotide flows that record no incorporation.
Read Length Distribution
Figure 3.0 - Read length distribution of high quality reads. R1 = 1 (rec'd 27JUNE11), R2 = 1 (rec'd
27JUNE11).
Run Name
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052
D_2011_07_28_12_32_02_frontend2_fullProcessing (2.6)
Run Statistics
Region  SID    Sample              HQ Reads  HQ Bases     Avg Read Length  Mode Read Length  %Mixed  %Dots
1       18052  1 (rec'd 27JUNE11)  588,357   335,946,095  571              688               14.57%  3.75%
2       18052  1 (rec'd 27JUNE11)  596,933   340,072,586  570              710               13.26%  4.01%
Table 4.0 - Summary of run metrics. %Mixed displays the percentage of reads filtered out by the
mixed filter, where a mixed read is the result of simultaneously sequencing a mixture of different
DNA molecules. %Dots shows the percentage of reads filtered by the dots filter. A dot is an
instance of 3 successive nucleotide flows that record no incorporation.
Read Length Distribution
Figure 4.0 - Read length distribution of high quality reads. R1 = 1 (rec'd 27JUNE11), R2 = 1 (rec'd
27JUNE11).
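The per-region figures in Tables 1.0 to 4.0 can be rolled up per run and compared with the typical yield of 900,000 to 1.3 million reads per run quoted in the deliverable. The Python sketch below does that arithmetic. The counts are transcribed from the tables; the run labels, the function names, and the reading of "per run" as the sum of both regions are assumptions made for illustration.

```python
# (HQ reads, HQ bases) for regions 1 and 2 of each run, from Tables 1.0 to 4.0.
# Run labels (date plus instrument tag from the run name) are illustrative.
RUNS = {
    "2011-07-12 sc20": [(700_573, 448_334_655), (712_483, 448_772_334)],
    "2011-07-25 sc17": [(605_878, 355_813_952), (552_550, 335_963_273)],
    "2011-07-25 sc21": [(631_895, 376_001_214), (564_899, 328_045_722)],
    "2011-07-27 sc17": [(588_357, 335_946_095), (596_933, 340_072_586)],
}

# Typical yield per run, as quoted in deliverable item 2.
TYPICAL_MIN, TYPICAL_MAX = 900_000, 1_300_000


def summarize(regions):
    """Return (total reads, total bases, mean read length) for some regions."""
    reads = sum(r for r, _ in regions)
    bases = sum(b for _, b in regions)
    return reads, bases, bases / reads


def yield_status(total_reads):
    """Place a run's HQ read count relative to the quoted typical range."""
    if total_reads < TYPICAL_MIN:
        return "below typical range"
    if total_reads > TYPICAL_MAX:
        return "above typical range"
    return "within typical range"


if __name__ == "__main__":
    for label, regions in RUNS.items():
        reads, bases, mean_len = summarize(regions)
        print(f"{label}: {reads:,} reads, {bases:,} bases, "
              f"mean {mean_len:.0f} b, {yield_status(reads)}")
    all_regions = [region for regions in RUNS.values() for region in regions]
    reads, bases, mean_len = summarize(all_regions)
    print(f"All runs: {reads:,} reads, {bases:,} bases, mean {mean_len:.0f} b")
```

On these figures the four runs total 4,953,568 HQ reads and 2,968,949,831 HQ bases, for a mean HQ read length of about 599 bases.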
Data Package
The following data files are provided as the delivered product and are available
for download at the Roche sFTP site. Please use sftp to obtain the files using a
Unix/Linux server. Be sure to type 'bin' to specify binary mode before you 'get'
the file. (Alternatively, you can use a browser from Windows or Mac.)
User Name : X
Password : X
Server : X
The files are compressed and packaged in .tgz format in Linux, and uploaded to
the sFTP server. After downloading (make sure binary file format is specified
during download), the .tgz file can be unpacked on a Linux/Unix machine by using
the command:
tar zxvf samplefile.tgz
The archive can also be unpacked by WinZip or WinRar programs on the Windows
platform. However, it is recommended that the customer use a Linux/Unix based
operating system to manipulate the files since the volume of data provided is
usually much larger than can be comfortably opened and viewed using Microsoft
Word or Excel.
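Where a scripted workflow is preferred, the tar command above can be mirrored with Python's standard tarfile module. The following is a minimal sketch, assuming the archive has already been downloaded and comes from a trusted source; the function names and the destination directory are hypothetical.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return the member names of a .tgz archive without extracting it."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


def unpack_archive(archive_path, dest_dir):
    """Extract a .tgz archive into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        archive.extractall(dest)
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    # "samplefile.tgz" is the placeholder name used in the tar command above.
    for name in list_archive("samplefile.tgz"):
        print(name)
```

Listing the members first is a cheap way to confirm that a download completed before committing disk space to a full extraction.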
File contents: (files shown are after unpacking with tar)
R20110712sc20_sffread.tgz
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/sff/
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/sff/G5UGEGY02.sff
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/sff/G5UGEGY01.sff
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/reads/
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/reads/1.454Reads.fna
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/reads/2.454Reads.fna
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/reads/2.454Reads.qual
R_2011_07_12_13_43_18_sc20_kontoudp_400_7075_93839220_SV912148_18052/D_2011_07_13_10_42_02_frontend2_fullProcessing/reads/1.454Reads.qual
R20110725sc17_sffread.tgz
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/sff/
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/sff/G6IQQIE02.sff
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/sff/G6IQQIE01.sff
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/reads/
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/reads/1.454Reads.fna
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/reads/2.454Reads.fna
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/reads/2.454Reads.qual
R_2011_07_25_16_28_56_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_34_02_frontend2_fullProcessing/reads/1.454Reads.qual
R20110725sc21_sffread.tgz
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/sff/
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/sff/G6IEU6N02.sff
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/sff/G6IEU6N01.sff
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/reads/
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/reads/1.454Reads.fna
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/reads/2.454Reads.fna
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/reads/2.454Reads.qual
R_2011_07_25_12_12_32_sc21_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_26_13_30_02_frontend2_fullProcessing/reads/1.454Reads.qual
R20110727sc17_sffread.tgz
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/sff/
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/sff/G6MC6LA02.sff
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/sff/G6MC6LA01.sff
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/reads/
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/reads/1.454Reads.fna
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/reads/2.454Reads.fna
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/reads/2.454Reads.qual
R_2011_07_27_15_26_35_sc17_bonvinl1_400_7075_93821020_912148_18052/D_2011_07_28_12_32_02_frontend2_fullProcessing/reads/1.454Reads.qual
The following is a description of all the files listed above:
File Name             Description
region.454Reads.fna   FASTA file of the individual sequence reads.
region.454Reads.qual  Corresponding quality score values for each base in the
                      sequence reads.
sfffilename.sff       Sequence Flowgram Format (SFF) file that represents all
                      quality filtered sequences. It contains information on 454
                      flowgram signals, basecalls and quality scores and can be
                      used to compile a package suitable for submission to the
                      NCBI trace archive. A description of the SFF format can be
                      found at NCBI at
                      http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?cmd=show&f=formats&m=doc&s=formats
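Because each .fna file has a matching .qual file, the two can be read side by side to attach a quality score to every base. The following Python sketch shows one way to do that with the standard library only. It assumes that both files list the reads in the same order and that the .qual records hold whitespace-separated integer scores under FASTA-style headers; the function names and the read IDs in the example are hypothetical.

```python
from io import StringIO


def read_records(handle):
    """Yield (read_id, payload_lines) for each '>' record in a FASTA-style file."""
    read_id, lines = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if read_id is not None:
                yield read_id, lines
            # Keep only the first token of the header as the read ID.
            read_id, lines = (line[1:].split() or [""])[0], []
        else:
            lines.append(line)
    if read_id is not None:
        yield read_id, lines


def paired_reads(fna_handle, qual_handle):
    """Yield (read_id, sequence, qualities), checking that IDs and lengths agree."""
    pairs = zip(read_records(fna_handle), read_records(qual_handle))
    for (seq_id, seq_lines), (qual_id, qual_lines) in pairs:
        if seq_id != qual_id:
            raise ValueError(f"read order differs: {seq_id} vs {qual_id}")
        sequence = "".join(seq_lines)
        qualities = [int(q) for line in qual_lines for q in line.split()]
        if len(sequence) != len(qualities):
            raise ValueError(f"{seq_id}: sequence and quality lengths differ")
        yield seq_id, sequence, qualities


if __name__ == "__main__":
    # Tiny in-memory stand-ins for 1.454Reads.fna and 1.454Reads.qual.
    fna = StringIO(">readA\nACGT\n>readB\nGG\nTT\n")
    qual = StringIO(">readA\n40 38 35 20\n>readB\n30 30\n25 10\n")
    for read_id, sequence, qualities in paired_reads(fna, qual):
        print(read_id, sequence, qualities)
```

With real data, the StringIO objects would be replaced by open file handles for a region's .fna and .qual files.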