Supplementary Material
A microarray for assessing transcription from pelagic marine microbial
taxa.
Shilova et al.
Methods
Design of the MicroTOOLs microarray
The seed dataset included sequences from the following strains, if present in
their genomes: alpha-Proteobacterium HIMB5, Crocosphaera watsonii WH8501,
gamma-Proteobacterium HTCC2207, Marinobacter sp. ELN17, Pelagibacter ubique
HTCC1062, Pelagibacter ubique HTCC7211, Prochlorococcus marinus MIT9301,
Prochlorococcus marinus MIT9313, Prochlorococcus marinus CCMP1986,
Pseudomonas stutzeri A1501, Roseobacter GAI101, Rhodobacterales bacterium
HTCC2255, Synechococcus sp. CC9311, Synechococcus sp. CC9605, Synechococcus sp.
WH8102, Trichodesmium erythraeum IMS101, and others. Sequences derived from
clone libraries of genes included amoA, cynA, narB, nifH, nirS, nr, ntcA, glnA, petB,
phnD, rbcL, urtA, and viral genes g20, gp23, mcp, pol, RdRp.
Testing probe specificity in silico: Each target sequence was trimmed at the
start of the first probe and the end of the last probe. The trimmed regions (“probed
region”) were used as queries in BLASTN against several datasets: all target
sequences, “Non-redundant nucleotide”, and “All Prokaryotic Genomes” databases
at CAMERA, and against the SILVA database of ribosomal RNA sequences as of May
2010 and August 2011, respectively (www.arb-silva.de). If the probed region had a
non-specific hit (defined as a hit to a non-target sequence with at least 95% nucleotide
identity over at least 90% of the region length), each probe in the region was analyzed
with BLASTN to identify and remove the exact non-specific probe(s).
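The screening rule above can be sketched as a filter over BLASTN tabular output. This is a minimal illustration, not the pipeline used in the study. It assumes the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, and so on) and that self-hits share the query identifier. The function name, the `region_lengths` mapping, and the example identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def nonspecific_hits(
    blast_lines: Iterable[str],
    region_lengths: Dict[str, int],
    min_identity: float = 95.0,
    min_coverage: float = 0.90,
) -> List[Tuple[str, str]]:
    """Return (query, subject) pairs that meet the non-specific hit rule.

    A hit is flagged when the subject differs from the query, percent
    identity is at least `min_identity`, and the aligned part of the query
    covers at least `min_coverage` of the probed region length.
    """
    flagged = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        q_start, q_end = int(fields[6]), int(fields[7])
        if subject == query:
            continue  # assumption: a self-hit carries the same identifier
        coverage = (q_end - q_start + 1) / region_lengths[query]
        if identity >= min_identity and coverage >= min_coverage:
            flagged.append((query, subject))
    return flagged
```

Probed regions returned by such a filter would then be re-examined probe by probe, as described above.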
RNA extraction and processing for hybridization to the microarray
RNA was extracted using the Ambion® RiboPure™ kit (Life Technologies, Grand
Island, NY) with modifications. The filter was removed from the Sterivex cartridge
and placed into a 2.0 ml microcentrifuge tube with 100 µl of 0.1 mm diameter glass
beads (Biospec Products, Bartlesville, OK) and 1.0 ml of Ambion® TRIzol reagent (Life
Technologies). The tubes were bead-beaten twice for 2 min and centrifuged at
12,000 × g for 10 min at 4 °C. The supernatant was transferred into a new 1.5 ml
microcentrifuge tube. Chloroform was added at one fifth of the supernatant volume;
the solution was vortexed for 15 s, incubated at room temperature for 15 min, and
centrifuged at 12,000 × g for 10 min at 4 °C. Subsequent RNA purification followed
the RiboPure (Ambion) manufacturer's instructions. If the concentration of RNA was
less than 100 ng µl⁻¹, the extract was concentrated by precipitating with 0.1 volume
of sodium acetate (0.3 M final concentration),
each 1.0 ml of
RNA extract, and 2.5 volumes of 100% (v/v) ice-cold ethanol. RNA samples were
treated with RNase-free DNase (QIAGEN, Valencia, CA, USA) for 30 min and purified
using QIAGEN RNeasy Mini Kit.
DNA extraction
DNA was extracted from the organic phase of the nucleic acid extract after RNA
separation using the RiboPure™ kit (Ambion) according to the manufacturer’s
instructions. Briefly, DNA was precipitated from the organic phase with 300 µl of
100% (v/v) ethanol per 1 ml of TRIzol® reagent used in cell lysis, washed three
times with ice-cold 75% (v/v) ethanol, and solubilized with 8 mM sodium
hydroxide. The pH of the DNA solution was adjusted to 8.4 with 1 M HEPES buffer.
The quality and quantity of DNA in the extracts were determined with a NanoDrop
1000 (Thermo Scientific) and the 2100 Bioanalyzer using the DNA 7500 kit (Agilent
Technologies). For qPCR, DNA samples were diluted 100-fold, and inhibition tests
were run on all samples.
Figures
[Figure S1 image. Panel C: replicate blocks Block_1 to Block_5; axis ticks 0, 20000, 50000.]
Figure S1. Normalized transcription in test samples obtained with a prototype high-density oligonucleotide microarray. The prototype microarray was designed
similarly to the MicroTOOLs microarray, but contained 13460 probes
representative of 97 gene categories. The probes were synthesized in five
replicates on a four-plex NimbleGen array (four chips on one slide, 72000 features
each; 4x72K). Total RNA samples were obtained from cultures of
cyanobacteria: Synechococcus sp. WH8102 (Synechococcus), Crocosphaera watsonii
WH8501 (C. wat), Trichodesmium erythraeum IMS101 (T.ery), and a mix of total
RNA from three cultures. Environmental samples were obtained from South Pacific
Ocean stations 9, 11, 12, and 17 during KM0704 between Fiji and Hawaii in April
2007 (Hewson et al., 2009, LO). RNA extraction and processing were done as
described in this study, except environmental samples were processed as described
in Hewson et al. (2009). Hybridization was done at the NimbleGen facility (Iceland).
The results are shown for hybridization of Prochlorococcus probes to Synechococcus
(A, top panel), Crocosphaera (A, second panel), and to the environmental sample from
Station 17 (A, bottom panel). Genes are arranged along the X axis and grouped by KEGG
class. Transcription normalized to the median in each sample is shown on the Y axis as
log(2) values. Less than 2% of Prochlorococcus-specific probes yielded cross-hybridization to Synechococcus-specific probes, and the genes were less than 5%
different at the nucleotide level. (B) Distribution of transcription signal for
Crocosphaera probes in culture and environmental samples. Less than 1% of
Trichodesmium-specific probes yielded cross-hybridization to Crocosphaera
transcripts. The detection of transcription in Crocosphaera at Stn. 12 was consistent
with the presence of Crocosphaera cells at this station detected with qPCR
(Moisander et al. 2010). (C) Correlation (Pearson 0.98±0.01) between technical
replicates for environmental sample SP_35016 from the South Pacific Ocean.
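The two quantities named in the Figure S1 caption, median-normalized log2 transcription and the Pearson correlation between technical replicates, can be illustrated as follows. This sketch assumes that "normalized to median" means dividing each signal by the sample median before taking log2; the caption does not spell out the exact formula. The function names and example values are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence


def log2_median_normalize(signals: Sequence[float]) -> List[float]:
    """Divide each signal by the sample median, then take log2.

    Assumes all signals are positive. A value of 0 means the gene sits
    at the sample median; 1 means twice the median.
    """
    sample_median = median(signals)
    return [math.log2(s / sample_median) for s in signals]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)
```

For replicate blocks, `pearson` would be applied to the signal vectors of each pair of blocks, and the mean and standard deviation of those coefficients reported.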
Figure S2. Quality control of microarray hybridizations and normalization. (A)
Boxplot of hybridization signals for 10000 randomly sampled probes before
normalization. The Y axis is the hybridization signal shown as log(2) exponent. (B) Density
distribution of hybridization signals for these probes before normalization. The X axis is
the hybridization signal shown as log(2) exponent. (C) Boxplot of transcription values
for 19560 genes obtained using the RMA algorithm and Li-Wong normalization as
described in Methods. The Y axis is normalized transcription shown as log(2) exponent.
(D) Density distribution of transcription values for the 19560 genes. The X axis is
transcription shown as log(2) exponent. Sample names: P_S1 stands for P-amended
replicate #1 and Fe_S1 stands for Fe-amended replicate #1.
[Figure S3 image. Axes: ERCC transcription signal, log2; ERCC transcript copy, log2 (tick labels 6.9E+2 and 1.1E+7).]
Figure S3. Hybridization signal for the ERCC mRNA spike-in mix (Ambion®) in all
samples. Different colors correspond to different samples. The detection range was
estimated as 700 to 11,000,000 copies of mRNA. The lowest
detected 700 mRNA molecules constitute 1.8E-06% of mRNA molecules 1000 nt long. This
relative sensitivity was calculated based on the following:
A) 1 µg contains 1.88E+12 mRNA molecules of 1000 nt (the average size of a
bacterial mRNA)
B) 400 ng of total RNA was used for cDNA synthesis
C) Considering rRNA as 95% of total RNA, 20 ng of mRNA was used for cDNA
synthesis
D) 20 ng is equivalent to 3.8E+10 mRNA molecules 1000 nt long
The relative cell sensitivity (the lowest relative abundance of cells within the
community that can be detected) of 0.0025% was based on the assumption of 1380
mRNA molecules per cell (Neidhardt, 1996) and estimated as 1.8E-06% multiplied by
1380.
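The arithmetic in steps A to D can be reproduced with a short calculation. All input values come from the caption; the function and variable names are hypothetical. Because the sketch carries unrounded intermediates, its results differ slightly from the rounded figures quoted above (about 1.86E-06% rather than 1.8E-06%, and about 0.0026% rather than 0.0025%).

```python
MOLECULES_PER_UG = 1.88e12   # step A: 1000-nt mRNA molecules in 1 ug
TOTAL_RNA_NG = 400.0         # step B: total RNA used for cDNA synthesis
MRNA_FRACTION = 0.05         # step C: rRNA taken as 95% of total RNA
LOWEST_DETECTED = 700.0      # lower end of the ERCC detection range
MRNA_PER_CELL = 1380.0       # assumption cited from Neidhardt (1996)


def mrna_molecules_in_reaction() -> float:
    """Steps C and D: mRNA mass in the reaction converted to molecules."""
    mrna_ug = TOTAL_RNA_NG * MRNA_FRACTION / 1000.0  # 20 ng = 0.02 ug
    return mrna_ug * MOLECULES_PER_UG


def relative_transcript_sensitivity_pct() -> float:
    """Lowest detectable transcript as a percentage of all mRNA molecules."""
    return LOWEST_DETECTED / mrna_molecules_in_reaction() * 100.0


def relative_cell_sensitivity_pct() -> float:
    """Lowest detectable relative cell abundance, in percent."""
    return relative_transcript_sensitivity_pct() * MRNA_PER_CELL


if __name__ == "__main__":
    print(f"mRNA molecules in reaction: {mrna_molecules_in_reaction():.2e}")
    print(f"transcript sensitivity: {relative_transcript_sensitivity_pct():.2e} %")
    print(f"cell sensitivity: {relative_cell_sensitivity_pct():.4f} %")
```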
Figure S4. (A) Distribution of p values (X axis) for the Wilcoxon test for 3000 randomly
selected genes, where N=100 and bandwidth=0.018. The data for clustering
and the W-test were centered and scaled across genes and samples. (B) SAM analysis:
observed versus expected scores. Significant genes deviate from the
‘expected=observed’ line; genes up-regulated in the P treatment and in the Fe
treatment are shown in green (lower left corner) and red (upper right corner),
respectively. Delta=0.677 was selected to find significantly differentially transcribed genes with
FDR=0.11.
Figure S5. (A) Hierarchical clustering of 3742 genes (rows) by transcription across
samples (columns). Blue and red in the heatmap represent down-regulation and up-regulation of transcription, respectively. Cluster numbers 1-9 in the all-genes
heatmap are shown on the left, where clusters 1-5 and 9 contain genes up-regulated
in the P amendment, and clusters 6, 7, and 8 contain genes up-regulated in the Fe
amendment. Transcription of genes up-regulated in the Fe or P amendments by
Phylogroup (B) and by Prochlorococcus and Synechococcus Clade (C). Transcription was
normalized to the mean across samples.
Figure S6. Normalized average transcription for top differentially transcribed genes
for (A) Replication and Cell cycle, (B) Iron stress response, and (C) P metabolism
and stress response. Transcription was normalized to the mean transcription across
samples.
Tables
File TableS1.xls
Table S1. MicroTOOLs array content: target genes in each microbial group and viruses.
File TableS2.xls
Table S2. Top 3000 genes with detected transcription at Stn ALOHA. Complete data is
available at NCBI GEO. Column names: Control, Phosphorus, and Iron: average
transcription values in control samples with no amendments, P-amended and Fe-amended
treatments, respectively; SD_Control, SD_Phosphorus, SD_Iron: standard deviations of
transcription values for Control, P- and Fe-amended treatments; NCBI_GI: NCBI GI for
the best hit in BLASTN if available; Organism: affiliation of the best hit; Pathway:
KEGG pathway; Source: where the nucleotide sequence was obtained; Per_Ident:
percent identity of the gene nucleotide sequence to the organism by BLASTN; Clade: clades
for Synechococcus and Prochlorococcus; Group: phylogroups.
File TableS3.xls
Table S3. Genes with detected transcription at Stn. ALOHA. Column names:
‘Detected transcripts’ is the count of detected transcripts for each gene;
‘Detected transcripts %’ is the percent of detected transcripts out of the total detected; ‘Total
probes %’ is the percent of detected transcripts out of the total number of probes in the
microarray for this gene.
File TableS4.xls
Table S4. Pearson correlation coefficients for all samples. Tab #1 includes all
detected genes and differential genes from Eukaryota, Prochlorococcus,
Synechococcus, Energy and Nitrogen metabolism as shown in Figure 4. Tab #2
shows correlation coefficients for other groups of genes: all differential genes, genes
from Archaea, Viruses, and genes from phosphorus metabolism, and iron
metabolism.
File TableS5.xls
Table S5. Top 50 differentially transcribed genes identified with LIMMA. Column
names: Fe - average transcription in Fe treatment, log2; P - average transcription in
P treatment, log2; logFC - log2 fold-change; adj.P.Val - adjusted p value by Benjamini
and Hochberg; ORGANISM – organism identified as a top hit by BLASTN; Gene –
gene name, Annotation – gene annotation; Module - KEGG class; NT % Ident –
BLASTN percent identity of the target sequence to the hit organism; ID - unique
gene ID on the microarray; NCBI ID – NCBI ID if known.
File TableS6.xls
Table S6. Differentially transcribed genes for Fe, N, P acquisition and metabolism,
energy metabolism and carbon fixation.