Coupling near infrared spectroscopy and
chemometrics for food and drug authentication
Federico Marini
Or better….
Marini - WSC8
Outline
• Nutritional quality of cereals
• Food contamination by mycotoxins
• Traceability of foodstuff
• Quantification of nutrients in baby powdered milk
• Determination of ee in drugs
Nutritional quality of cereals
• Oat (Avena sativa) is considered one of the most important
grain cereals for human consumption.
• Indeed, oat products are important sources of dietary fiber,
β-glucan, good nutritional value proteins, vitamins and
other components, which have been shown to be beneficial
for human health.
• To assess the potential nutritive value of new
naked oat genotypes during breeding work, this study
focuses on the possibility of developing a rapid, accurate
and precise alternative method for the simultaneous
quantification of β-glucan and protein content in naked oat
samples.
The data set
• The whole data set comprises 168 naked oat samples from
12 varieties, originally coming from Italy and other
European countries and being representative of a large
genetic range.
• 166 samples analyzed as flour by NIR spectroscopy
• 54 samples analyzed as flour by NIT spectroscopy
• 168 samples analyzed as whole grain by NIT spectroscopy
• Robust calibration models built by Partial Robust M-regression (PRM)
Data set split
NIR on flour
[Figure: results for β-glucan and protein content]
Did we need robust methods?
• The significant number of weights < 1, for both horizontal and vertical
outlyingness, indicates that the choice of robust calibration was justified.
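A minimal sketch of how such case weights can be computed, assuming a Fair weight function with tuning constant c = 4 and MAD scaling. These choices, the coordinate-wise median used as the centre of the scores, and all function names are illustrative assumptions; the slides do not give the settings actually used. Weights close to 1 mark regular samples, weights well below 1 mark vertical (residual) or horizontal (leverage) outliers.

```python
import numpy as np

def fair_weight(z, c=4.0):
    """Fair weight function: 1 at z = 0, decreasing smoothly as |z| grows."""
    return 1.0 / (1.0 + np.abs(z / c)) ** 2

def robust_weights(residuals, scores, c=4.0):
    """Return (vertical, horizontal) case weights in (0, 1]."""
    residuals = np.asarray(residuals, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Vertical outlyingness: residuals scaled by a robust spread (MAD).
    centred = residuals - np.median(residuals)
    mad = 1.4826 * np.median(np.abs(centred))
    w_vertical = fair_weight(centred / mad, c)
    # Horizontal outlyingness: distance of each score vector from the
    # coordinate-wise median, scaled by the median distance (assumption).
    dist = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
    w_horizontal = fair_weight(dist / np.median(dist), c)
    return w_vertical, w_horizontal

# Synthetic illustration (not the oat data): one residual outlier, one leverage point.
rng = np.random.default_rng(0)
res = rng.normal(0.0, 1.0, 30)
res[5] = 12.0
T = rng.normal(0.0, 1.0, (30, 3))
T[7] = [9.0, 9.0, 9.0]
w_v, w_h = robust_weights(res, T)
```

In a full robust calibration these weights would be recomputed iteratively, refitting a weighted PLS model at each step until they stabilise.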
Results
• Reflectance measurements on flour seem to be the best
experimental setup.
• However, a different test set was used in the three
experiments.
• Model building repeated on a common data set.
Comparison of the three spectroscopic setups
• Analysis was carried out only on the 54 samples analyzed by
the three different spectroscopic approaches
• Clearer evidence of NIR on flour being the better setup
Mycotoxins
• Mycotoxins are products of the secondary metabolism of
pathogenic fungi.
• They are among the highest-impact contaminants of cereal
crops.
• Attention was focused on deoxynivalenol (DON), a mycotoxin produced by
fungi of the genus Fusarium.
• EU limit: wheat above 1750 ppb is considered
contaminated.
• NIR-based approach for quantification and/or assessment of
the contamination status.
Data set
• More than 150 samples analyzed at least in replicate by NIR
and NIT.
• 45 samples left aside as independent test set.
Calibration
• Best pretreatment: MSC + 2nd derivative
• RMSEC=17.63; RMSEP=17.82
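The pretreatment and error measure named above can be sketched as follows. This is a minimal illustration on synthetic spectra, assuming MSC against the mean spectrum and a Savitzky-Golay second derivative; the window length (15 points) and polynomial order (2) are assumptions, since the slides do not state them.

```python
import numpy as np
from scipy.signal import savgol_filter

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # s ~ intercept + slope * ref
        corrected[i] = (s - intercept) / slope
    return corrected, ref

def second_derivative(spectra, window=15, polyorder=2):
    """Savitzky-Golay second derivative along the wavelength axis."""
    return savgol_filter(spectra, window_length=window, polyorder=polyorder,
                         deriv=2, axis=1)

def rmse(y_true, y_pred):
    """Root mean square error; RMSEC on calibration data, RMSEP on test data."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic spectra: one band with a different offset and scaling per sample.
x = np.linspace(-3, 3, 101)
base = np.exp(-x ** 2)
raw = np.array([a + b * base for a, b in [(0.1, 0.8), (0.3, 1.0), (0.5, 1.3)]])
corrected, ref = msc(raw)
d2 = second_derivative(corrected)
```

For new samples, the reference spectrum stored from the calibration set would be passed as `reference` so that test spectra are corrected in the same way.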
Classification
• Best pretreatment: MSC + 2nd derivative
• Classification accuracies:
– 96.4% contaminated; 92.5% non-contaminated (calibration)
– 80% contaminated; 100% non-contaminated (test)
Traceability: introduction
• Labelling issues are of increasing concern.
• Growth and promotion of “added value” regional foods such as
those produced under “Organic” and “Designated Origin” labels.
• Many labelling claims that relate to perceived added value are
rarely supported by analytical data, leaving regulators to rely
solely on paper auditing procedures to monitor compliance.
• Need for analytical specifications for labelling issues relating to
food origin:
– geographical origin,
– production origin
– species origin.
Tracing the origin of foodstuff
• Assessing the typicality of a product and its traceability requires
an analytical method to determine the origin of the sample.
• Unfortunately, although a great many instrumental analytical techniques
are currently under investigation, none gives results that can be
directly related to the origin of the samples.
An alternative way to cope with this problem is to use
mathematical and statistical methods (chemometrics) to
process the results of a set of determinations performed on
the samples in order to obtain the desired classification.
An example: olive oil
• Authentication of the origin of olive oil samples
• 57 extra virgin olive oil samples
– 20 from Sabina, Lazio (13 harvested 2009, 7 harvested 2010)
– 37 samples of different origin (22 from 2009, 15 from 2010)
• MIR and NIR spectra recorded on each sample
Training/test set selection
• Duplex algorithm repeated class-wise on each pretreatment separately (split
ratio: 2/1)
• Samples selected for the test set more than 10 times (out of 15) were assigned to the test set
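A minimal sketch of a Duplex-style split for one class and one pretreatment, assuming Euclidean distances and a fixed test-set size. The function name and the rule that the remaining samples go to the training set once the test set is full are illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def duplex_split(X, n_test):
    """Split the rows of X into training and test indices with the Duplex idea."""
    X = np.asarray(X, dtype=float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    remaining = list(range(len(X)))

    def farthest_pair():
        sub = D[np.ix_(remaining, remaining)]
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        return remaining[i], remaining[j]

    def pick_next(selected):
        # Maximin rule: the candidate whose nearest selected sample is farthest away.
        min_d = D[np.ix_(remaining, selected)].min(axis=1)
        return remaining[int(np.argmax(min_d))]

    train, test = [], []
    # Seed each set with the two most distant samples still available.
    for subset in (train, test):
        a, b = farthest_pair()
        subset += [a, b]
        remaining.remove(a)
        remaining.remove(b)
    # Alternate between the two sets; once the test set is full, fill the training set.
    while remaining:
        for subset in (train, test):
            if not remaining:
                break
            if subset is test and len(test) >= n_test:
                continue
            k = pick_next(subset)
            subset.append(k)
            remaining.remove(k)
    return sorted(train), sorted(test)

# Synthetic stand-in for one class of 20 samples, split 2/1.
rng = np.random.default_rng(0)
X_class = rng.normal(size=(20, 5))
train_idx, test_idx = duplex_split(X_class, n_test=7)
```

Running this once per pretreatment and counting how often each sample lands in the test set gives the selection frequencies mentioned above.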
PLS-DA on MIR data
% correct classification
Pretreatment               LV   Cal. Sabina   Cal. Other origins   CV Sabina   CV Other origins
Linear baseline            6    100.0         100.0                92.3        86.4
Quadratic baseline         6    100.0         100.0                92.3        86.4
1st derivative (SG)        7    100.0         100.0                84.6        86.4
2nd derivative (SG)        3    84.6          86.4                 84.6        72.7
MSC                        3    100.0         95.5                 84.6        95.5
MSC + quadratic baseline   4    100.0         95.5                 92.3        95.5
MSC + 1st derivative       6    100.0         100.0                84.6        86.4
MSC + 2nd derivative       3    84.6          86.4                 84.6        68.2
• Best results with MSC + quadratic baseline.
• %cc on test set: 85.7% (Sabina); 86.7% (other origins)
PLS-DA on NIR data
% correct classification
Pretreatment               LV   Cal. Sabina   Cal. Other origins   CV Sabina   CV Other origins
MSC                        3    100.0         95.5                 100.0       95.5
Detrending                 4    100.0         95.5                 100.0       95.5
1st derivative (SG)        5    100.0         95.5                 100.0       95.5
2nd derivative (SG)        3    92.3          81.8                 76.9        86.4
MSC + detrending           4    100.0         95.5                 100.0       95.5
MSC + 1st derivative       4    92.3          95.5                 92.3        90.9
MSC + 2nd derivative       4    84.6          90.9                 84.6        86.4
• Best results in CV with 4 pretreatments.
• %cc on test set (d1): 100% (Sabina); 100% (other origins)
• %cc on test set (other 3): 100% (Sabina); 93.3% (other origins)
Spectral interpretation
• All the spectral regions identified as relevant correspond
to significant NIR features:
– the bands at around 4450-5000 cm-1, which may be attributed to
combination bands of C=C and C-H stretching vibration of cis
unsaturated fatty acids,
– the bands between 5650 and 6000 cm-1, due to the combination
bands and first overtone of C-H of methylene of aliphatic groups
of oil,
– and those between 7074 and 7180 cm-1, corresponding to C-H
combination band of methylene.
SIMCA on MIR data
Pretreatment               PC   Cal. % Sensitivity   Cal. % Specificity   CV % Sensitivity   CV % Specificity
Linear baseline            3    100.00               72.73                76.92              81.82
Quadratic baseline         4    100.00               81.82                61.54              95.45
1st derivative (SG)        4    100.00               72.73                84.62              81.82
2nd derivative (SG)        4    100.00               63.64                69.23              68.18
MSC                        4    100.00               77.27                69.23              86.36
MSC + quadratic baseline   4    100.00               81.82                69.23              86.36
MSC + 1st derivative       3    100.00               63.64                92.31              63.64
MSC + 2nd derivative       4    100.00               63.64                76.92              68.18
• Sensitivity decreases in CV.
• Best model with d1 (based on geometric average of sens & spec)
• Test set: 71.43% (sens.); 73.33% (spec.)
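The selection criterion can be checked with a few lines. This sketch takes the cross-validation sensitivity and specificity values reported for SIMCA on MIR data and ranks the pretreatments by their geometric mean; the variable names are illustrative.

```python
from math import sqrt

# CV (% sensitivity, % specificity) for SIMCA on MIR data, as reported above.
cv_results = {
    "Linear baseline": (76.92, 81.82),
    "Quadratic baseline": (61.54, 95.45),
    "1st derivative (SG)": (84.62, 81.82),
    "2nd derivative (SG)": (69.23, 68.18),
    "MSC": (69.23, 86.36),
    "MSC + quadratic baseline": (69.23, 86.36),
    "MSC + 1st derivative": (92.31, 63.64),
    "MSC + 2nd derivative": (76.92, 68.18),
}

def geometric_mean(sens, spec):
    """Geometric mean of sensitivity and specificity (both in %)."""
    return sqrt(sens * spec)

scores = {name: geometric_mean(*pair) for name, pair in cv_results.items()}
best = max(scores, key=scores.get)
```

The geometric mean penalises unbalanced models: a pretreatment with very high sensitivity but poor specificity (for example MSC + 1st derivative) scores lower than one with two moderately high values.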
SIMCA on NIR data
Pretreatment               PC   Cal. % Sensitivity   Cal. % Specificity   CV % Sensitivity   CV % Specificity
MSC                        4    100.00               100.00               61.54              100.00
Detrending                 4    100.00               95.45                76.92              95.45
1st derivative (SG)        3    100.00               90.91                69.23              90.91
2nd derivative (SG)        5    100.00               100.00               61.54              100.00
MSC + detrending           3    100.00               90.91                76.92              90.91
MSC + 1st derivative       3    100.00               95.45                76.92              95.45
MSC + 2nd derivative       4    100.00               100.00               69.23              100.00
• Best model with det & MSC+d1 (based on geometric avg. of sens & spec)
• Test set (det): 100% (sens.); 93.33% (spec.)
• Test set (MSC+d1): 71.43% (sens.); 86.67% (spec.)
Effect of year (SIMCA NIR)
Pretreatment               PC   Cal. % Sensitivity   Cal. % Specificity   CV % Sensitivity   CV % Specificity
MSC                        3    100.00               68.18                76.92              81.82
Detrending                 3    100.00               100.00               84.62              100.00
1st derivative (SG)        3    100.00               100.00               84.62              100.00
2nd derivative (SG)        4    100.00               86.36                61.54              90.91
MSC + detrending           3    100.00               100.00               84.62              100.00
MSC + 1st derivative       2    100.00               95.45                84.62              95.45
MSC + 2nd derivative       3    100.00               86.36                84.62              90.91
• Best model with det, d1 & MSC+det
• Models highly sensitive and specific in cal. & CV
• Test set (2010 samples): high specificity, no sensitivity
Effect of year (Coomans plot)
• If discriminant classification is sought, a good classification ability
can still be obtained, notwithstanding the diversity between years
A second example: pistachio nuts
• The major pistachio-producing countries of the world are, in order,
Iran, the USA (California), Turkey and Syria; other countries, among
them Italy and India, cultivate pistachios to a lesser extent.
• In Italy, only one variety (Bianca) is cultivated, mainly in Bronte.
• Italian production is very low compared with that of Asia and the
USA; however, this is compensated by its very high quality.
• Tariff rates and national laws on commodities vary dramatically between
producing countries. Because of the resulting differences in quality,
food safety (e.g., contamination by aflatoxins), import/export fees,
legal implications and financial concerns, determining the country of
origin of pistachios is important to protect consumers against
potential fraud, and analytical methods to determine geographical
origin are needed.
Samples
• 483 pistachio samples from the 4 main producing countries + Italy and
India were analyzed (NIR spectra recorded on both halves of nut and
averaged)
Country          Nr. of samples
Italy (Bronte)   41
Iran             121
India            41
Syria            40
Turkey           120
USA              120
Training/test splitting
Country          Nr. of samples (training + test)
Italy (Bronte)   22+19
Iran             83+38
India            23+18
Syria            25+15
Turkey           86+34
USA              81+39
PLS-DA modeling
Best model: MSC+detrending
              Bronte   India    Iran     Syria    Turkey   USA
Calibration   97.48%   96.82%   90.54%   96.47%   95.17%   99.79%
CV            97.32%   95.30%   89.48%   93.97%   93.15%   99.17%
Prediction    95.14%   91.29%   83.59%   93.63%   91.71%   99.19%
[Figure legend: Bronte: red; India: blue; Iran: black; Syria: green; Turkey: cyan; USA: purple. Training set: empty symbols; test set: filled symbols]
Predictions
VIP
SIMCA
• Optimal complexity chosen as that giving the best geometric
average of sensitivity and specificity in CV
SIMCA
• Best model: MSC+detrending
Class    Sensitivity (Cal)   Specificity (Cal)   Sensitivity (CV)   Specificity (CV)   Sensitivity (Pred)   Specificity (Pred)
Bronte   95.45%              95.64%              72.73%             95.30%             89.47%               96.53%
India    100.00%             90.57%              65.22%             93.60%             83.33%               98.62%
Iran     98.80%              68.35%              93.98%             67.93%             92.11%               76.80%
Syria    88.00%              88.81%              88.00%             87.46%             73.33%               82.43%
Turkey   95.35%              79.06%              84.88%             80.77%             73.53%               80.62%
USA      93.83%              100.00%             85.19%             100.00%            87.18%               99.19%
SIMCA models
Bronte in detail
Quantification of nutrients in powdered milk for babies
• Baby powdered milk is a product based on milk of cows or other animals
and/or other ingredients which have been proven to be suitable for infant
feeding.
• The nutritional safety and adequacy should be scientifically demonstrated
to support normal growth and development of infants.
• In addition to the compositional requirements, other ingredients may be
added to ensure that the formulation is suitable as the sole source of
nutrition for the infant, or to provide other benefits that are similar to
outcomes of populations of breastfed babies.
• The suitability for the particular nutritional uses of infants and the safety of
additional compounds added at the chosen levels shall be scientifically
demonstrated.
Samples
• Preliminary results only on lipid content.
BiPLS
• Spectral region is divided into intervals & PLS models are computed
removing one interval at a time.
• Interval whose deletion results in lowest RMSECV is removed.
• Procedure is iterated up to the desired number of retained variables.
[Figure: 8 intervals (1600 variables), then 7 intervals (1400 variables), then 6 intervals (1200 variables), …]
GA (genetic algorithm)
Population P(t) of n chromosomes with m genes
• generation of the initial population
• evaluation of the fitness of chromosomes
• ranking of chromosomes according to fitness
• cross-over
• mutation
• new generation P(t+1)
• repeat evaluation, ranking, cross-over and mutation until stop
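A minimal sketch of the loop above applied to variable selection, where each chromosome is a boolean mask over the variables. As a stand-in for the RMSECV of a PLS model, the fitness here is the validation RMSE of an ordinary least-squares fit on the selected variables. The population size, mutation rate, single-point cross-over and elitism are likewise illustrative assumptions, not the settings of the study.

```python
import numpy as np

def fitness(mask, X_cal, y_cal, X_val, y_val):
    """Negative validation RMSE of a least-squares fit on the selected columns."""
    if not mask.any():
        return -np.inf
    A_cal = np.column_stack([np.ones(len(X_cal)), X_cal[:, mask]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, mask]])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    return -float(np.sqrt(np.mean((y_val - A_val @ coef) ** 2)))

def ga_select(X_cal, y_cal, X_val, y_val, n_chrom=30, n_gen=40, p_mut=0.02, seed=0):
    rng = np.random.default_rng(seed)
    m = X_cal.shape[1]
    pop = rng.random((n_chrom, m)) < 0.3            # P(0): n chromosomes, m genes
    best_mask, best_fit = None, -np.inf
    for _ in range(n_gen):
        fit = np.array([fitness(c, X_cal, y_cal, X_val, y_val) for c in pop])
        order = np.argsort(fit)[::-1]               # ranking according to fitness
        pop, fit = pop[order], fit[order]
        if fit[0] > best_fit:
            best_mask, best_fit = pop[0].copy(), fit[0]
        parents = pop[: n_chrom // 2]               # fitter half reproduces
        children = []
        while len(children) < n_chrom - 1:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, m)                # single-point cross-over
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(m) < p_mut          # mutation: flip a few genes
            children.append(child)
        pop = np.vstack([pop[0]] + children)        # P(t+1), keeping the best one
    return best_mask, -best_fit

# Synthetic data: y depends on variables 3 and 11 only.
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 20))
y = X[:, 3] - 2.0 * X[:, 11] + 0.1 * rng.normal(size=60)
mask, rmse_ga = ga_select(X[:40], y[:40], X[40:], y[40:])
```

Because the search is stochastic, GA selection is usually repeated several times and the variables chosen most often are retained.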
BiPLS-GA
Method         RMSECV
No selection   1.41
BiPLS          0.96
BiPLS-GA       0.90
Motivation
• Quite often only one enantiomeric form of an active pharmaceutical
ingredient (API) is pharmacologically active.
• After cases such as that of thalidomide, the FDA recommends
that the enantiomeric purity of pharmaceutical formulations
be checked.
• Present methods:
– Polarimetry
– HPLC on chiral columns
– NMR
• This study investigates the possibility of using NIR + chemometrics to
predict the enantiomeric excess (ee) of an API.
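For reference, the quantity being predicted follows the usual definition of enantiomeric excess, written here with n_R and n_S for the amounts of the two enantiomers (symbols chosen for illustration):

```latex
\[
  ee\,(\%) = \frac{\lvert n_R - n_S \rvert}{n_R + n_S} \times 100
\]
```

A racemate therefore has ee = 0%, a pure enantiomer has ee = 100%, and a 75:25 mixture has ee = 50%.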
Ibuprofen: phase diagram
• The phase diagram confirms that ibuprofen crystallizes as a
racemic compound.
• This suggests that R and S have the same spectrum, but that
the spectrum of the racemate differs from both.
Ibuprofen: NIR spectra
Ibuprofen: calibration & testing
• Best pretreatment: MSC + 2nd derivative
• 4 LVs
• RMSEC = 2.11
• RMSECV = 2.43
• RMSEP = 1.71
Interpretation - VIP
• 4000-4800 cm-1: combination bands of methylene C-H and overtones of the bending modes
• 6000 cm-1: first overtone of the aromatic C-H stretching
and of the asymmetric stretching of methyl groups