Development of a NMR-based Metabolomics Analysis Methodology for Toxicology
Jahns, G.L.1, Reo, N.V.2, Kent, M.N.2, Burgoon, L.D.3, Zacharewski, T.R.3, DelRaso, N.4
1BAE Systems, San Diego, CA 92123, 2Department of Biochemistry & Molecular Biology, Wright State University, Dayton, OH 45429, 3Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824, 4Human Effectiveness Directorate, Air Force Research Laboratory, Wright Patterson AFB, OH 45433
[Figure panel label: region 68]
[Methodology outline flowchart, final steps:]
7. PCA scores -> significant data
8. nonparametric significance test
9. determine fold, abs. change
Sample breakdown and notation:
• 4 controls at 168 hr. (endpoint), labeled C7, C8, C9, C10 (C8 is used as reference)
• 5 controls at 72 hr., labeled C11, C12, C13, C14, C15
• 4 treatments at 168 hr., labeled T1, T2, T3, T4
• 3 treatments at 72 hr., labeled T15, T16, T17
[Figure: cross correlation between samples C8(168 hr) and C11(72 hr) in region 69, 21.918 to 23.687 PPM.]
[Figure: An example of a reduced spectrum, containing 43,700 points. Panel label: C8(168hr). Axis: chemical shift reduced index.]
Measuring Experimental Effects
To quantify the experimental effect at each channel, we next measure the change in each of the 4 pairwise comparisons.
The experimental change for the 1045 significant spectral channels is characterized as either a relative (fold) change (left panel below) or an absolute change (right panel below) for each of the 4 pairwise comparisons. The vertical index scale tracks back to the chemical shift value of the spectral channel. This can be used to identify metabolites contributing to the experimental effect.
[Figure: fold change (left panel) and absolute change (right panel). Axis: change magnitude.]
[Figure: Distribution of Mann-Whitney scores for each pair of experimental conditions (C72, C168, T72, T168). Axis: Mann-Whitney test score. The extremes are annotated with p-value = 0.036, 0.029, 0.057, and 0.016.]
[Methodology outline flowchart steps: integrate intensities in each bin; subdivide spectrum into regions; align all samples in each region; concatenate -> reduced spectrum; do PCA, cluster conditions.]
Alignment strategy:
• divide spectra into short regions with signal bounded by segments of noise floor
• choose 1 (of the 16) spectra as reference
• calculate the cross correlation of each spectrum with the reference as a function of lag (i.e., offset) for each region
• assemble reduced spectra by concatenating regions offset by the lag with maximum cross correlation, leaving out long segments of noise floor
In Practice:
• spectra were divided into 78 regions (varying from 0.18 to 3.27 PPM in width) encompassing 43,700 of the original 131,072 points
[Figure: An example showing 2 of the selected regions for 4 spectra, one from each experimental condition.]
[Figure axes for the cross correlation plot: cross correlation value vs. lag (PPM).]
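The regional alignment idea (cross-correlate each region against the reference, then concatenate the regions shifted by the best lag) can be sketched as follows. This is a minimal illustration under stated assumptions, not the poster's implementation: the function names `best_lag` and `reduced_spectrum`, the `max_lag` search limit, the use of a norm-normalized cross correlation, and the toy Gaussian peaks are all hypothetical. Lags are in points; converting to PPM would require the spectrum's PPM-per-point spacing.

```python
import numpy as np

def best_lag(reference, spectrum, max_lag):
    """Return (lag, value) maximizing the normalized cross correlation.

    A positive lag means the pattern in `spectrum` sits `lag` points to the
    right of the same pattern in `reference`.
    """
    norm = np.sqrt(np.dot(reference, reference) * np.dot(spectrum, spectrum))
    xcorr = np.correlate(spectrum, reference, mode="full") / norm
    lags = np.arange(-(len(reference) - 1), len(spectrum))
    keep = np.abs(lags) <= max_lag          # only search small offsets
    best = np.argmax(xcorr[keep])
    return int(lags[keep][best]), float(xcorr[keep][best])

def reduced_spectrum(spectrum, reference, regions, max_lag):
    """Concatenate the regions of `spectrum`, each shifted by its best lag.

    Assumes every region is bounded by noise floor, so that the shifted
    window stays inside the spectrum.
    """
    pieces = []
    for start, stop in regions:
        lag, _ = best_lag(reference[start:stop], spectrum[start:stop], max_lag)
        pieces.append(spectrum[start + lag:stop + lag])
    return np.concatenate(pieces)

# Toy data (hypothetical): two peaks that are displaced by different amounts,
# mimicking a misalignment that is not one global shift.
points = np.arange(200)

def gaussian_peak(center):
    return np.exp(-0.5 * ((points - center) / 3.0) ** 2)

reference = gaussian_peak(50) + gaussian_peak(150)
sample = gaussian_peak(53) + gaussian_peak(148)
regions = [(20, 80), (120, 180)]

aligned = reduced_spectrum(sample, reference, regions, max_lag=10)
target = np.concatenate([reference[a:b] for a, b in regions])
```

Because each region gets its own lag, the two toy peaks are corrected independently, which is the reason for aligning regionally rather than applying one offset to the whole spectrum.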
• The range (across the 16 samples) of contribution
from each spectral channel to the first and second
principal component score is determined
• The top contributors that account for > 90% of the
observed separation are retained; this is 1089
unique spectral channels
• The nonparametric Mann-Whitney test is applied
to each of the 1089 unique spectral channels for
each of the 4 pairs of experimental conditions.
Extreme test score values indicate that all
samples of one condition are separated from all
samples of the other condition. It is found that
1045 of the 1089 channels satisfy this condition
for at least one of the 4 pairs of comparison.
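As a worked check of what an "extreme" Mann-Whitney score means, the sketch below computes the U statistic for two small groups and the exact two-sided p-value of an extreme score, which is 2 / C(n1 + n2, n1) when there are no ties. The helper names are hypothetical. With the group sizes given in the sample breakdown (4, 5, 4, and 3 samples), this formula gives 0.016, 0.057, 0.029, and 0.036, which appears to match the p-values annotated on the score distributions; the assignment of each value to a pair is an inference from the arithmetic, not something the poster states.

```python
from math import comb

def mann_whitney_u(x, y):
    """U statistic for x: count of (x_i, y_j) pairs with x_i > y_j (ties count 1/2)."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)

def is_extreme(x, y):
    """True when every sample of one group lies beyond every sample of the other."""
    u = mann_whitney_u(x, y)
    return u == 0 or u == len(x) * len(y)

def extreme_p_value(n1, n2):
    """Exact two-sided p-value of an extreme score under the null (no ties)."""
    return 2 / comb(n1 + n2, n1)

# Group sizes taken from the sample breakdown on the poster.
group_sizes = {"C168": 4, "C72": 5, "T168": 4, "T72": 3}
comparisons = [("C72", "C168"), ("T72", "T168"), ("T168", "C168"), ("T72", "C72")]

p_values = {
    pair: round(extreme_p_value(group_sizes[pair[0]], group_sizes[pair[1]]), 3)
    for pair in comparisons
}
```

With groups this small, an extreme score is the only outcome that can reach these p-values, which is why the test reduces to asking whether the two conditions are completely separated at a given channel.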
[Methodology outline flowchart: decision "bin?" (yes / no); step: choose bin size.]
[Figure axis labels: spectral channel index; chemical shift (PPM). Condition labels: C72, C168, T168, T(72hr). Cross correlation annotation: max c = 0.9202.]
The 4 pairwise comparisons:
• Controls at 72 hr. referenced to Controls at 168 hr. (time effect)
• Treatments at 72 hr. referenced to Treatments at 168 hr. (time effect)
• Treatments at 168 hr. referenced to Controls at 168 hr. (treatment effect)
• Treatments at 72 hr. referenced to Controls at 72 hr. (treatment effect)
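A fold change and an absolute change for each of the 4 pairwise comparisons could be computed per channel as in the sketch below. The poster does not give the formulas, so the definitions used here are assumptions: fold change is taken as the ratio of group means and absolute change as their difference, with a small `floor` guarding the division. The function name and the toy intensities are hypothetical.

```python
import numpy as np

def channel_changes(condition, reference, floor=1e-12):
    """Per-channel fold and absolute change of `condition` relative to `reference`.

    Both inputs have shape (channels, samples). Assumed definitions:
    fold = mean(condition) / mean(reference), absolute = difference of means.
    """
    cond_mean = condition.mean(axis=1)
    ref_mean = reference.mean(axis=1)
    fold = cond_mean / np.maximum(ref_mean, floor)
    absolute = cond_mean - ref_mean
    return fold, absolute

# (condition, reference) pairs as listed on the poster.
comparisons = [("C72", "C168"), ("T72", "T168"), ("T168", "C168"), ("T72", "C72")]

# Toy intensities (hypothetical): 3 channels, group sizes as in the study.
spectra = {
    "C168": np.outer([1.0, 2.0, 4.0], np.ones(4)),
    "C72": np.outer([1.0, 2.0, 2.0], np.ones(5)),
    "T168": np.outer([2.0, 2.0, 1.0], np.ones(4)),
    "T72": np.outer([1.5, 2.0, 2.0], np.ones(3)),
}

changes = {(c, r): channel_changes(spectra[c], spectra[r]) for c, r in comparisons}
```

Reporting both quantities is useful because a large fold change on a weak channel can correspond to a small absolute change, and the reverse.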
The work reported in our abstract has been extended in two primary directions:
1. Integration of intensity values (“binning”) across a span of chemical shifts that is large compared to the peak misalignment addresses that misalignment but can mask the response of multiple constituents within a bin. Our extended work retains the original instrumental resolution and addresses misalignment through a regional alignment procedure.
2. The study compares controls and treatments at multiple time points. We have adopted a principal components analysis (PCA) approach to find multidimensional metrics of experimental effect.
[Methodology outline flowchart step: establish noise baseline.]
[Figure annotations: lag = 0.069 PPM; lag = 0.002 PPM. Axis: normalized amplitude (arb.).]
Strategy:
The full experiment and the standard analysis are described in poster #738.
[Methodology outline flowchart steps: normalize to total intensity; remove extrinsic peaks.]
The hepatic metabolomic response of immature ovariectomized C57Bl/6 mice to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) was examined using thin layer chromatography (TLC), 13C, 31P, and 1H NMR (14.1 T), and high pressure liquid chromatography (HPLC). Mice were treated with either sesame oil or 30 µg/kg TCDD by gavage and sacrificed at 168 hr. Treatment induced a significant increase in liver weight with marked cytoplasmic vacuolization accompanied by individual cell apoptosis and inflammation. Oil Red O staining indicated vacuolization was due to lipid accumulation and TLC analysis of lipid extracts revealed a 2.5-fold increase in triglycerides. The work reported here focuses on practical issues with analysis of the 13C lipid spectra and is based on 16 spectra from control and treated animals at 72 and 168 hour time points.
Misalignment characteristics:
• identified by same pattern at different PPM values
• varies across spectrum, but is not a simple linear correction
[Figure: example spectra showing misalignment. Sample labels: T1(168hr), T17(72hr), C11(72hr). Condition labels: T(168hr), C(72hr). Lag annotations: lag = 0 PPM; lag = 0.069 PPM; lag = 0.080 PPM.]
[Figure axis label: PC 1]
Significant spectral contributions to the separation found in the PCA
scores plot are determined pairwise for the 4 combinations shown
schematically at left:
[Figure labels: C(168hr) (correlation reference); region 69. Axis: number of occurrences.]
[Methodology outline flowchart step: acquire spectra.]
Experiment Design and Issues
As has been described for 1H NMR spectra, 13C
resonances are subject to additional frequency shifts
that can cause problems with misalignment of peaks
(“positional noise”). In lipid samples from tissue
extracts, the 13C peak positions are dependent upon
sample concentration and the composition of lipids
present. Lipid composition is a true biological effect
that is part of the positional noise.
• can be as much as 0.14 PPM
Methodology outline:
Determining Significant Contributors from PCA Scores
[Figure: spectrum of sample C8(168hr). Axis: normalized amplitude (arb.).]
Abstract
Metabolomics is the simultaneous measurement of metabolites from endogenous and exogenous chemicals, which may be used to identify putative biomarkers of exposure and toxicity. Currently, most metabolomics studies focus on using pattern recognition techniques to cluster spectrometric peaks, but most fail to statistically identify peaks associated with exposure. We have developed a data analysis and processing methodology for Nuclear Magnetic Resonance (NMR) spectrometry to 1) identify and eliminate spectral regions with no signal, 2) statistically characterize the significance of differentially expressed metabolite signals, and 3) quantify the change in these signals. The method identifies spectral regions with no signal by scanning spectra with a low-level threshold. Detection Theory is used to produce probabilistic estimates of the presence of a treatment effect, based on either a minimum Bayesian risk cost or a constant false alarm rate. The treatment effect is then quantified by either absolute or relative (fold) changes of the significant bins. As an example, hepatic lipid extracts from mice dosed with 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) were analyzed using 13C NMR. Noise screening eliminated channels with no signal in both control and treatment replicates, reducing active bins from 1024 to 192. The Bayesian-cost significance metric further reduced the data to 77 channels with a high probability of treatment effect. We ranked these bins both by absolute and by fold change to identify channels showing the largest effect. These results are valuable as they stand, or can serve as a screened basis for further classification and identification analysis.
Funded by NIEHS R01 ES013927
Alignment and Size Reduction Procedure
Principal Component Analysis of Reduced Spectra
• The reduced spectra from all samples were concatenated into a 43700 x 16 data matrix
• The data matrix was centered by subtracting the mean data vector, then singular value decomposition was performed
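The centering and singular value decomposition steps can be sketched as below. This is an illustration under assumptions rather than the authors' code: "the mean data vector" is read as the mean spectrum across samples, the function name `pca_scores` is hypothetical, and the toy matrix is 50 x 16 instead of 43700 x 16.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """PCA of a (channels, samples) matrix via SVD.

    Returns the singular values, the loadings (channels x k), and the
    principal component scores (samples x k).
    """
    # Subtract the mean spectrum (mean over samples) from every sample.
    centered = data - data.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = u[:, :n_components]
    scores = centered.T @ loadings      # equals (vt.T * s)[:, :n_components]
    return s, loadings, scores

# Toy data (hypothetical): 16 samples in two groups that differ along one pattern.
rng = np.random.default_rng(0)
pattern = rng.normal(size=(50, 1))
signs = np.array([1.0] * 8 + [-1.0] * 8)
data = rng.normal(size=(50, 1)) + pattern * signs + 0.01 * rng.normal(size=(50, 16))

singular_values, loadings, scores = pca_scores(data)
```

Each column of `loadings` holds one weight per spectral channel, so the per-channel contribution to a sample's score can be read off as the product of that weight and the centered intensity.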
The excellent clustering of the data demonstrates that the regional alignment and size reduction methodology described above has preserved the information on experimental effect in the data.
[Figure: 13C NMR spectrum with solvent peak label CH3OH. Axis: chemical shift (PPM).]
The spectrum at lower left expands the amplitude scale to show the noise level. A histogram of regions containing only noise (55075 points out of the total 131072) is shown below. The standard deviation is established for each sample for use when calculating significant changes in step 9.
[Figure: histogram of noise-only points. Annotation: s = 0.0036.]
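The preprocessing steps named on the poster (remove extrinsic peaks, normalize to total intensity, establish noise baseline) can be sketched as follows. The order of the steps, the function names, the region boundaries, and the synthetic spectrum are assumptions made for illustration; the poster does not specify how the noise-only regions were chosen.

```python
import numpy as np

def zero_regions(spectrum, regions):
    """Return a copy with the amplitude zeroed inside each (start, stop) region."""
    out = spectrum.copy()
    for start, stop in regions:
        out[start:stop] = 0.0
    return out

def normalize_total_intensity(spectrum):
    """Scale the spectrum so that its intensities sum to 1."""
    return spectrum / spectrum.sum()

def noise_std(spectrum, noise_regions):
    """Sample standard deviation over regions assumed to contain only noise."""
    noise = np.concatenate([spectrum[a:b] for a, b in noise_regions])
    return noise.std(ddof=1)

# Synthetic spectrum (hypothetical): noise, one signal peak, one solvent peak.
rng = np.random.default_rng(1)
points = np.arange(2000)
spectrum = 0.01 * rng.normal(size=points.size)
spectrum += np.exp(-0.5 * ((points - 500) / 5.0) ** 2)    # signal peak
spectrum += np.exp(-0.5 * ((points - 1500) / 5.0) ** 2)   # solvent-like peak

cleaned = zero_regions(spectrum, [(1450, 1550)])
normalized = normalize_total_intensity(cleaned)
sigma = noise_std(normalized, [(0, 400), (600, 1400)])
```

Estimating the standard deviation after normalization keeps it on the same amplitude scale as the intensities that are later compared between conditions.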
Data Preprocessing
[Figure: NMR spectrum of lipid extract. Light blue peaks are extrinsic solvents. Peak label: CDCl3. Axes: normalized amplitude (arb.) vs. chemical shift (PPM).]
The 2 solvent peaks shown at left are removed by zeroing the amplitude in a region around each peak (the CDCl3 peak is actually a triplet).
• The principal component scores were calculated for the 16 samples. A scatterplot of principal component 2 scores vs. principal component 1 scores (right panel below) shows that the data cluster by experimental conditions. The centers of the 4 clusters are indicated by the + data points
• The singular values (left panel below) show that the first two principal components dominate
[Figure, left panel: Singular value vs. Singular value index.]
[Figure, right panel: PC 2 vs. PC 1 scores, with clusters labeled C(168hr), T(168hr), C(72hr), T(72hr).]
Excellent separation of the 4 experimental conditions is observed in the first two principal component scores
Conclusions
• The methodology outlined here successfully maintains NMR spectral resolution while dealing with “positional noise”
• Principal Component Analysis of processed spectra results in excellent separation of experimental conditions and provides a means of assessing the significance of the original spectral contributions
• Future work will analyze more extensive data and address identification of the observed significant spectral contributions
Supported by NIH / NIEHS Grant R01 ES013927
● E-mail: gary.jahns@baesystems.com
● http://www.alphatech.com/primary/index.htm
● http://dbzach.fst.msu.edu