Supplementary materials and methods
Genome sequencing
DNA from Vc450 was extracted using the CTAB method of Wilson (1994). The
genome was sequenced, assembled, and finished at the Joint Genome Institute (Los Alamos,
NM). Draft sequences were obtained from a blend of Sanger and 454 sequencing and involved
paired-end Sanger sequencing on 8 kb plasmid libraries to 5X coverage and 20X coverage of 454
sequences. All libraries together provided 6.5x coverage. The general aspects of library
construction and sequencing performed at the JGI can be found at http://www.jgi.doe.gov/. To
finish the genome, a collection of custom software and targeted reaction types was used. The
Phred/Phrap/Consed software package (www.phrap.com) was used for sequence assembly and
quality assessment (Ewing and Green, 1998; Ewing et al., 1998; Gordon et al., 1998). After the
shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC).
Possible misassemblies were corrected with Dupfinisher (Han and Chain, 2006) or transposon
bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs
were closed by editing in Consed, custom primer walks, or PCR amplification (Roche Applied
Science, Indianapolis, IN). Gene-finding and annotation were achieved using the RAST server
based on subsystem technology (Aziz et al., 2008). The V. coralliilyticus ATCC BAA-450
genome was deposited in the GenBank database under accession number ACZN00000000.
Genes are identified in the text by locus tags, assigned by the RAST system, and GenBank
protein accession numbers are cross-referenced for corresponding locus tags in the Supporting
Online Materials (table S1).
2D-LC-MS/MS
Vc450 peptides were fractionated by SCX chromatography on a 2.1 mm PolySULFOETHYL
A™ ion-exchange column (PolyLC, Columbia, MD). The peptides were separated at a flow rate
of 200 µL/min with the following 100 min gradient: 10 mmol/L to 120 mmol/L ammonium
formate over 80 min and 120 mmol/L to 500 mmol/L ammonium formate over the remaining 20
min. Acetonitrile (250 mL/L) and formic acid (1 mL/L) were constant throughout the gradient.
In the first experiment, fractions from each sample were collected at a rate of 1 fraction/min
(total time = 100 min). Subsequently, fractions 1-5 were pooled together, and fractions 77-100
were pooled in pairs (77 and 78, 79 and 80, etc.), creating a total of 83 fractions. Each of
the 83 fractions was further analyzed by LC-MS/MS using a reverse-phase C18 1 mm column
(Waters, Milford, MA) on an LTQ linear ion trap mass spectrometer (Thermo Fisher Scientific,
San Jose, CA). A flow rate of 40 µL/min was employed with a gradient of 0.02 L/L to 0.3 L/L
acetonitrile with 1 mL/L formic acid over 75 min, which was increased from 0.3 L/L to 0.9 L/L
between 75 and 90 min. For the second experiment, fractions were collected at a rate of 1
fraction/6.7 min (total time = 100 min). Each fraction was analyzed by nano-LC-MS/MS using
a reverse-phase C18 75 µm column (Microtech Scientific, Anaheim, CA) on an LTQ linear ion
trap mass spectrometer (Thermo Fisher Scientific, San Jose, CA). A flow rate of 30 µL/min was
employed with a gradient of 0.02 L/L to 0.6 L/L acetonitrile with 1 mL/L formic acid over 120
min, which was increased from 0.6 L/L to 0.9 L/L between 120 and 140 min.
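The two-segment SCX salt gradient described above can be written as a small function. This is only an illustrative sketch: it assumes each segment is a linear ramp between the stated endpoints (the text gives only the endpoints), and the function name is hypothetical.

```python
def ammonium_formate_mmol_per_l(t_min):
    """Ammonium formate concentration (mmol/L) at time t_min of the 100 min SCX gradient.

    Assumption: each segment is a linear ramp between the endpoints given in the text.
    """
    if not 0.0 <= t_min <= 100.0:
        raise ValueError("time must lie within the 100 min gradient")
    if t_min <= 80.0:
        # Segment 1: 10 -> 120 mmol/L over the first 80 min
        return 10.0 + (120.0 - 10.0) * t_min / 80.0
    # Segment 2: 120 -> 500 mmol/L over the remaining 20 min
    return 120.0 + (500.0 - 120.0) * (t_min - 80.0) / 20.0
```

Under the linear-ramp assumption, a fraction collected at 1 fraction/min can be mapped to an approximate salt concentration by passing its collection time to this function.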
Protein Identification
Mass spectra from both experiments were matched to predicted tryptic peptides from the Vibrio
coralliilyticus genome (ACZN00000000) using Turbo SEQUEST (Eng et al., 1994). A fragment
ion mass tolerance of 1.00 Da and a parent ion tolerance of 2.0 Da were used. Additionally, the
iodoacetamide derivative of cysteine and oxidation of methionine were specified in Turbo
SEQUEST as variable modifications. Data were filtered using a peptide probability < 1 × 10⁻³,
cross correlation scores (Xcorr) >1.0/2.0/2.5 for +1/+2/+3 ions, respectively, and a DeltaCN of
0.1. A minimum of two unique peptides was used for the purposes of protein identification.
These parameters resulted in false positive rates for peptide identifications of less than 0.1%,
based on matches to the reversed protein sequences. SEQUEST search result files (.srf) of the
combined dataset were loaded into Scaffold (Proteome Software Inc., Portland, OR; version
Scaffold_2_05_01) for validation of peptide and protein identifications. Peptide identifications
were accepted using the Peptide Prophet algorithm (Keller et al., 2002) at a >95% cutoff.
Protein identifications were accepted using the Protein Prophet algorithm (Nesvizhskii and
Aebersold, 2004) at a >99% cutoff in combination with a two peptide requirement. Proteins that
contained similar peptides and could not be differentiated based on MS/MS analysis alone were
grouped to satisfy the principles of parsimony as determined by Scaffold. That is, shared
peptides were counted only once, in the protein with the highest probability according to
Peptide Prophet and Protein Prophet.
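The SEQUEST filtering thresholds and the two-peptide rule can be sketched as follows. This is not the Turbo SEQUEST or Scaffold implementation. The `PSM` record and both function names are hypothetical, the DeltaCN value of 0.1 is assumed to be a lower bound, and charge states other than +1/+2/+3 are simply rejected because the text gives no threshold for them.

```python
from collections import defaultdict
from dataclasses import dataclass

# Minimum Xcorr per precursor charge state, as quoted in the text.
XCORR_MIN = {1: 1.0, 2: 2.0, 3: 2.5}


@dataclass(frozen=True)
class PSM:
    """Hypothetical record for one peptide-spectrum match."""
    protein: str
    peptide: str
    charge: int
    xcorr: float
    delta_cn: float
    probability: float


def passes_filters(psm):
    """Apply the probability, Xcorr and DeltaCN thresholds to one match."""
    threshold = XCORR_MIN.get(psm.charge)
    if threshold is None:
        # Charge states not listed in the text are rejected in this sketch.
        return False
    return (
        psm.probability < 1e-3
        and psm.xcorr > threshold
        and psm.delta_cn >= 0.1  # assumption: 0.1 is a lower bound
    )


def identified_proteins(psms, min_unique_peptides=2):
    """Return proteins supported by at least min_unique_peptides distinct filtered peptides."""
    peptides = defaultdict(set)
    for psm in psms:
        if passes_filters(psm):
            peptides[psm.protein].add(psm.peptide)
    return {prot for prot, seqs in peptides.items() if len(seqs) >= min_unique_peptides}
```

Collecting peptides in a set per protein means that the same sequence observed at several charge states still counts as one unique peptide.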
Spectral Counting
Undersampling (Delmotte et al., 2009) and limited dynamic range (Domon and Aebersold, 2006)
are known limitations in shotgun approaches to whole proteome analysis; therefore, in order to
identify all possible proteins from the data, spectral counts for both the 83- and 15-fraction
2D-LC-MS/MS experiments were combined in Scaffold (Proteome Software Inc., Portland, OR;
version Scaffold_2_05_01). Only the spectral counts for proteins identified by Turbo SEQUEST
and validated by Scaffold analysis were used for comparison. We used the G-test of
independence, a likelihood ratio test for discrete data, to quantify the relative expression of
proteins between V. coralliilyticus grown at 24 and 27 °C. Spectral counts were normalized
according to the total number of spectral counts for both data sets, as suggested by Old et al.
(2005) and Hendrickson et al. (2006), using the following equations:
$O_{24} = n_{24} + \lambda$ and $O_{27} = n_{27} \cdot (t_{24}/t_{27}) + \lambda$, where $O_{24}$ and $O_{27}$ are the
normalized observed spectral counts for a given protein at 24 and 27 °C, $n_{24}$ and $n_{27}$ are the
actual spectral counts for that protein, $t_{24}$ and $t_{27}$ are the total spectral counts for the entire
dataset at each temperature, and $\lambda$ is a set value (0.5) used to avoid taking the logarithm of
zero. Normalized spectral counts were used to generate a G statistic using the following equation:
$G = 2\,[O_{24} \ln(O_{24}/E_{24}) + O_{27} \ln(O_{27}/E_{27})]$. Under the null hypothesis that protein
expression is equal at the two temperatures, $E_{24} = E_{27} = (O_{24} + O_{27})/2$. The
G statistic is approximately distributed as $\chi^2$ with one degree of freedom (Sokal and Rohlf,
1994). This allowed us to calculate a p-value to distinguish proteins and virulence categories
that are differentially expressed at 24 and 27 °C. The strict parameters used for protein
identification in conjunction with the tendency to overlook low-abundance proteins using
spectral counting (Zhou et al., 2010) most likely resulted in a conservative estimate of the
proteins present in this study.
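As a worked illustration of the G-test described in this section, the following sketch computes the normalized counts, the G statistic, and the corresponding p-value for a single protein. It assumes that the ratio of total spectral counts scales the 27 °C count multiplicatively, and it uses the identity that, for one degree of freedom, the upper tail of the χ2 distribution equals erfc(sqrt(G/2)). The function name and the example counts are hypothetical.

```python
import math

LAMBDA = 0.5  # pseudocount from the text; avoids taking ln(0)


def g_test(n24, n27, t24, t27):
    """Return (G, p) for one protein's spectral counts at 24 and 27 degrees C.

    n24, n27: raw spectral counts for the protein at each temperature.
    t24, t27: total spectral counts of each dataset.
    """
    o24 = n24 + LAMBDA
    o27 = n27 * (t24 / t27) + LAMBDA  # assumption: multiplicative scaling by t24/t27
    expected = (o24 + o27) / 2.0      # equal expression under the null hypothesis
    g = 2.0 * (o24 * math.log(o24 / expected) + o27 * math.log(o27 / expected))
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    # Chi-square survival function with one degree of freedom.
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Hypothetical example: 30 vs 10 spectra, with equal dataset totals.
g, p = g_test(30, 10, 1000, 1000)
```

Equal normalized counts give G = 0 and p = 1, and G grows as the two counts diverge.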
Bioluminescence assays
Vc450 was cultured as described above with 1 mL of V. coralliilyticus inoculum (OD610 = 2.4)
added to 99 mL GASW medium. Individual cultures were grown in triplicate at 21, 24, 27, 30, 33
or 37 °C, and 1.5 mL of cell culture was collected from each sample after 3, 12 and 24 h to
represent exponential, early stationary and late stationary growth phases, respectively. The cell
cultures were centrifuged at 14,000 rpm for 5 min and the supernatant was then filtered using a
0.2 µm filter. The filtered supernatants were kept at 4 °C for two weeks, during which triplicate
luminescence bioassays were performed. The V. harveyi reporter strains BB886 (ATCC
BAA-1118™, luxPQ::Tn5Kan) and BB170 (ATCC BAA-1117™, luxN::Tn5Kan) were used to
determine the presence of AI-1 and AI-2 signaling molecules, respectively. A simplified
autoinducer bioassay (AB) medium (Bassler et al., 1994; Turovskiy & Chikindas, 2006) was
used containing 0.3 M sodium chloride, 0.005 M magnesium sulfate, and 0.2% vitamin-free
casamino acids. This AB medium was adjusted to pH 7.5 using 1 N sodium hydroxide and
autoclaved. After cooling to room temperature, the following filter-sterilized components were
added to complete the AB medium: 1% of 1 M potassium phosphate buffer (pH 7.0), 1% of 0.1
M L-arginine solution, and 2% of 50% glycerol. The luminescence bioassays were performed as
described by Bassler et al. (1994). Briefly, the reporter strains were grown at 30 °C for 16 h in
AB medium to an OD610 of 1.0 and diluted 1:5000 in fresh AB medium, and 180 µL of the
diluted reporter strain was added to each well of a 96-well plate. Cell-free V. coralliilyticus
supernatant was added to the reporter strains at 10% (20 µL), and the plates were incubated in
a rotary shaker at 130 rpm and 30 °C. Luminescence measurements were taken every hour during
the 5 h incubation period using a luminometer (BMG Novostar). The measurements taken
at the 5 h time point were normalized to background controls (i.e., reporter strains
with sterile medium added) and presented as the fold change compared to endogenous levels of
luminescence expressed by the reporter strains.
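The fold-change normalization and the 10% supernatant volume fraction can be sketched as below. This is an illustration only. It assumes the replicate readings are averaged before taking the ratio to the sterile-medium controls, and the function names and example readings are hypothetical.

```python
from statistics import mean


def fold_change(sample_readings, background_readings):
    """Mean luminescence of supernatant wells relative to sterile-medium control wells.

    Assumption: replicate wells are averaged before the ratio is taken.
    """
    background = mean(background_readings)
    if background <= 0:
        raise ValueError("background luminescence must be positive")
    return mean(sample_readings) / background


def supernatant_fraction(supernatant_ul=20.0, reporter_ul=180.0):
    """Volume fraction of supernatant in a well (20 µL added to 180 µL by default)."""
    return supernatant_ul / (supernatant_ul + reporter_ul)


# Hypothetical 5 h readings (arbitrary luminescence units) from triplicate wells.
ratio = fold_change([300, 330, 270], [100, 110, 90])
```

A ratio above 1 indicates luminescence above the endogenous level of the reporter strain.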
References
Aziz, R.K., Bartels, D., Best, A.A., DeJongh, M., Disz, T., Edwards, R.A., et al. (2008). The
RAST server: Rapid annotations using subsystems technology. BMC Genomics 9, 1-15.
Darling, A.C.E., Mau, B., Blattner, F.R., Perna, N.T. (2004). Mauve: Multiple alignment of
conserved genomic sequence with rearrangements. Genome Res 14, 1394-1403.
Delmotte, N., Lasaosa, M., Tholey, A., Heinzle, E., Dorsselaer, A., Huber, C.G. (2009).
Repeatability of peptide identifications in shotgun proteome analysis employing off-line
two dimensional chromatographic separations and ion trap MS. J Sep Sci 32, 1156-1164.
Domon, B., Aebersold, R. (2006). Mass spectrometry and protein analysis. Science 312, 212-217.
Eng, J.K., McCormack, A.L., Yates, J.R. (1994). An approach to correlate tandem mass spectral
data of peptides with amino acid sequences in a protein database. J Am Soc Mass
Spectrom 5, 976-989.
Ewing, B., Green, P. (1998). Base-calling of automated sequencer traces using phred. II. Error
probabilities. Genome Res 8, 186-194.
Ewing, B., Hillier, L., Wendl, M.C., Green, P. (1998). Base-calling of automated sequencer
traces using phred. I. Accuracy assessment. Genome Res 8, 175-185.
Gordon, D.C., Abajian, C., Green, P. (1998). Consed: a graphical tool for sequence finishing.
Genome Res 8, 195-202.
Han, C.S., Chain, P. (2006). Finishing repetitive regions automatically with Dupfinisher. In
Proceedings of the 2006 International Conference on Bioinformatics & Computational
Biology, Arabnia, H.R., Valafar, H., eds. (Las Vegas, NV, CSREA Press), pp. 142-147.
Hendrickson, E.L., Xia, Q., Wang, T., Leigh, J.A., Hackett, M. (2006). Comparison of spectral
counting and metabolic stable isotope labeling for use with quantitative microbial
proteomics. Analyst 131, 1335-1341.
Keller, A., Nesvizhskii, A.I., Kolker, E., Aebersold, R. (2002). Empirical statistical model to
estimate the accuracy of peptide identifications made by MS/MS and database search.
Anal Chem 74, 5383-5392.
Nesvizhskii, A.I., Aebersold, R. (2004). Analysis, statistical validation and dissemination of
large-scale proteomics datasets generated by tandem MS. Drug Discov Today 9, 173-181.
Old, W.M., Meyer-Arendt, K., Aveline-Wolf, L., Pierce, K.G., Mendoza, A., Sevinsky, J.R.,
Resing, K.A., Ahn, N.G. (2005). Comparison of label-free methods for quantifying
human proteins by shotgun proteomics. Mol Cell Proteomics 4, 1487-1502.
Overbeek, R., Begley, T., Butler, R.M., Choudhuri, J.V., Chuang, H.Y., Cohoon, M., et al.
(2005). The subsystems approach to genome annotation and its use in the project to
annotate 1000 genomes. Nucleic Acids Res 33, 5691-5702.
Sokal, R.R., Rohlf, F.J. (1994). Biometry: the principles and practice of statistics in biological
research, 3rd edition: Freeman and Co., New York.
Tamura, K., Dudley, J., Nei, M., Kumar, S. (2007). MEGA4: Molecular evolutionary genetics
analysis (MEGA) software version 4.0. Mol Biol Evol 24, 1596-1599.
Wilson, K. (1994). Preparation of genomic DNA from bacteria, In: Ausubel, R.B.F.A.,
Kingston, R.E., Moore, D.D., Seidman, J.G., Struhl, K. (Eds.) Current Protocols in
Molecular Biology. John Wiley & Sons, New York, pp. 1-5.
Zhou, J.Y., Schepmoes, A.A., Zhang, X., Moore, R.J., Monroe, M.E., Lee, J.H., Camp, D.G.,
Smith, R.D., Qian, W.J. (2010). Improved LC-MS/MS spectral counting statistics by
recovering low-scoring spectra matched to confidently identified peptide sequences. J
Proteome Res 9, 5698-5704.