Molecular Modeling of Hydrophobic Organic Contaminants Uptake

Computer Assisted Structure Elucidation and 3-D Structural
Modeling of “Complex” and “Operationally Defined” Organic
Compounds: Fundamental Concepts and Case Studies
Mamadou S. Diallo (1, 2)
(1) Materials and Process Simulation Center, Beckman Institute, California Institute of Technology, Pasadena
(2) Department of Civil Engineering, Howard University, Washington DC
Outline
• Background
• Computer Assisted Structure Elucidation: The
Signature Molecular Descriptor
• Computer Assisted Structure Elucidation: The
SIGNATURE Program
• Case Study : Computer Assisted Structural Elucidation
and 3-D Structural Modeling of Chelsea Soil Humic
Acid
• Summary and Outlook
• Acknowledgments
Background
• Computational chemistry is increasingly being used to
characterize the molecular physical chemistry of organic/inorganic
compounds.
• The starting point of any molecular level investigation of the
physical-chemical behavior of a given compound by computational
chemistry is the bond topology, that is, a list of the connections between
all its atoms.
• For “small” and “well” defined organic/inorganic compounds, a
crystal structure or a 2-D structural model is usually available.
• There are many cases in chemistry (e.g., environmental chemistry,
petroleum chemistry, soil chemistry, organic geochemistry and the
chemistry of natural products) where the 2-D/3-D “structures” of the
compounds of interest are not known.
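As a concrete picture of what "bond topology" means here, the following minimal sketch stores a molecule as a list of atoms plus a list of connections. Ethanol is used only as an illustrative example; the variable and function names are assumptions made for this sketch and do not come from any particular modeling package.

```python
# Bond topology of ethanol (CH3-CH2-OH), written as a plain list of connections.
# Atom indices: 0, 1 = carbons; 2 = oxygen; 3-8 = hydrogens.
atoms = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]


def adjacency(n_atoms, bond_list):
    """Turn a list of bonds into a neighbor list for each atom."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bond_list:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


topology = adjacency(len(atoms), bonds)
# Number of bonded neighbors per atom (4 for each carbon, 2 for oxygen, 1 for hydrogen).
valence = {i: len(nbrs) for i, nbrs in topology.items()}
```

For "operationally defined" compounds, this list of connections is exactly what is unknown and has to be inferred from analytical data.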
“Operationally Defined” Organic Compounds in
Environmental Chemistry
• Humic acids (HAs) are operationally defined as the fraction of natural organic matter that is insoluble in aqueous solutions at acidic pH (<2) and soluble in aqueous solutions at higher pH.
• They are ubiquitous in nature. In terrestrial ecosystems, the amount of carbon in HAs (6.0 × 10^12 tons) exceeds that in living organisms.
• They act as (i) soil stabilizers, (ii) nutrient and water reservoirs for plants, (iii) sorbents for toxic metal ions, radionuclides and organic pollutants and (iv) chemical buffers with catalytic activity.
• A commonly accepted view in the literature is that HAs are organic geo-macromolecules formed through the degradation of plant biopolymers and/or the condensation of plant and microbial degradation products.
• However, due to this broad diversity in structural building blocks and formation pathways, reliable 3-D structural models that capture the specific chemistry for HAs from a given source have yet to be achieved despite two centuries of investigations.
“Operationally Defined” Organic Compounds in
Petroleum Chemistry
• Asphaltenes are operationally
defined as the non-volatile
fraction of petroleum that is
insoluble in n-alkanes and
soluble in aromatic solvents.
• The precipitation of
asphaltenes can cause such
severe problems as reservoir
and pipeline plugging.
• The adsorption of asphaltenes
at oil-water interfaces has
been shown to drastically
increase the stability of water-in-oil (W/O) emulsions
generated during petroleum
recovery by waterflooding.
• Asphaltenes also adversely
impact oil refining. They can
promote coke formation,
deactivate catalysts and are the
main components of vacuum
residua.
• Most of the scientific and
technological challenges
associated with the production
and processing of heavy oils are
directly related to their high
content of non-volatile and
refractory compounds such as
asphaltenes.
Limitations of the Conventional Approach for
Modeling Humic Acids and Asphaltenes
• There are two major impediments to this conventional approach.
• First, the structure elucidation process is carried out manually.
• This may be prohibitively time consuming for multifunctional geo-macromolecules such as humic acids and asphaltenes.
• Second and more importantly, when several isomers can be built from the same analytical data set, the conventional approach does not provide any means of selecting the “appropriate” isomer.
• Thus, reliable results may be difficult to achieve when structural models of HAs generated with the conventional approach are used in subsequent calculations of their physicochemical properties by computational chemistry.
Computer Assisted Structure Elucidation: The Signature
Descriptor (Faulon, J. Chem. Inf. Comput. Sci., 1994, 34, 1204-1218)
• The signature is a systematic codification system over an
alphabet of atom types, describing the extended valence
(i.e., neighborhood) of the atoms of a molecule.
• For complex organic geo-macromolecules such as humic
acids and asphaltenes, the signature descriptor provides a
simple and robust means of coding:
– (i) elemental analysis data as 0 level atomic signatures,
– (ii) quantitative 1H/13C NMR data as 1 or 2 level atomic
signatures, and
– (iii) qualitative data (e.g., molecular fragments and
interfragment bonds from FT-IR spectroscopy,
qualitative 1-D/2-D NMR spectroscopy, ESI mass
spectrometry, etc.) as 1, 2, or higher level molecular
signatures.
Computer Assisted Structure Elucidation: Signature of an Atom
Faulon, J. Chem Inf. Comput. Sci., 1994, 34, 1204-1218
• A molecule can be represented by the saturated atomic graph G = {V, E}, where the elements of V are the atoms and the edges of E are the bonds.
• Let v be a vertex of the atomic graph G = {V, E} and T_l(v) the spanning subtree of height l rooted on v.
• The l-signature of v is defined as s_l(v) = c{T_l(v)}.
• Thus, the subtree T_l(v) of the atomic graph G = {V, E} can be viewed as a molecular fragment centered on the atom v, reduced to a limited environment of radial distance l.
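The definition above can be illustrated with a small recursive sketch. This is a simplified stand-in, not the canonical encoding of the cited paper: it unfolds the neighborhood of an atom to a given height, never steps straight back to the parent atom, and sorts the branch strings so that the result does not depend on atom numbering. The atom-type labels (c_, h_, o_) mimic the style used later in this deck; the function name and the ethanol example are assumptions made for illustration.

```python
# Ethanol (CH3-CH2-OH): atoms 0, 1 = carbon, 2 = oxygen, 3-8 = hydrogen.
atom_types = ["c_", "c_", "o_", "h_", "h_", "h_", "h_", "h_", "h_"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]

neighbors = {i: [] for i in range(len(atom_types))}
for a, b in bonds:
    neighbors[a].append(b)
    neighbors[b].append(a)


def atom_signature(v, height, parent=None):
    """Simplified height-limited signature of atom v (illustrative only)."""
    label = atom_types[v]
    if height == 0:
        return label
    # Recurse into every neighbor except the atom we arrived from.
    branches = sorted(
        atom_signature(u, height - 1, v) for u in neighbors[v] if u != parent
    )
    return label + "(" + "".join(branches) + ")" if branches else label


methyl_h1 = atom_signature(0, 1)    # methyl carbon and its four neighbors
hydroxyl_h2 = atom_signature(2, 2)  # hydroxyl oxygen out to two bonds
```

Increasing the height widens the "radial distance" of the environment that the string describes, at the cost of longer and more specific signatures.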
Computer Assisted Structure Elucidation: Signature of a
Molecule as a Linear Combination of Its Atomic Signatures
(Faulon, J. Chem Inf. Comput. Sci., 2003, 43(3) 707-721) and Dreiding FF Atom Types
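Read literally, the slide title says that the signature of a molecule is the sum of the signatures of its atoms. A minimal sketch of that bookkeeping, under the assumption that a simple sorted-neighbor string is an acceptable stand-in for the height-1 atomic signature, is a multiset count. Ethanol and all names below are illustrative choices, not taken from the SIGNATURE program.

```python
from collections import Counter

# Ethanol (CH3-CH2-OH) with Dreiding-style type labels (illustrative).
atom_types = ["c_", "c_", "o_", "h_", "h_", "h_", "h_", "h_", "h_"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]

neighbors = {i: [] for i in range(len(atom_types))}
for a, b in bonds:
    neighbors[a].append(b)
    neighbors[b].append(a)


def height1_signature(v):
    """Atom type followed by the sorted types of its bonded neighbors."""
    branches = sorted(atom_types[u] for u in neighbors[v])
    return atom_types[v] + "(" + "".join(branches) + ")"


# Height 0: the molecular signature reduces to an element (atom type) count.
molecular_h0 = Counter(atom_types)
# Height 1: each distinct atomic environment appears with its multiplicity.
molecular_h1 = Counter(height1_signature(v) for v in neighbors)
```

At height 0 the count is equivalent to elemental composition, which is why elemental analysis data can be coded as 0 level signatures.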
Computer Assisted Structure Elucidation: Signature of a
Molecular Fragment
(Faulon, J. Chem Inf. Comput. Sci., 1994, 34, 1204-1218)
Computer Assisted Structure Elucidation: Signature of an
Interfragment Bond
(Faulon, J. Chem Inf. Comput. Sci., 1994, 34, 1204-1218)
Computer Assisted Structure Elucidation: The Signature
Equation
(Faulon, J. Chem Inf. Comput. Sci., 1994, 34, 1204-1218)
• l-signatures of molecular fragments + l-signatures of interfragment bonds = l-signatures of the unknown structure
• s_l(S) and s_l'(S) are the l-signatures and associated standard deviations of the unknown structure
• x_i and y_j are, respectively, the quantities of molecular fragment f_i and interfragment bond b_j
• I and J are, respectively, the numbers of molecular fragments f_i and interfragment bonds b_j
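The equation itself did not survive on this slide. One form that is consistent with the word equation and the symbol definitions above is sketched below; it is a reconstruction under that assumption, not a transcription of the cited paper.

```latex
\sum_{i=1}^{I} x_i \, s_l(f_i) \;+\; \sum_{j=1}^{J} y_j \, s_l(b_j) \;=\; s_l(S)
```

In this reading, the standard deviations s_l'(S) would presumably set how closely the left-hand side has to match s_l(S) for each signature component, and the unknowns are the non-negative quantities x_i and y_j.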
Hierarchical Approach for Modeling Humic Substances
[Flowchart; stages in order]
• Experimental Characterization: EA, FT-IR spectroscopy, 1-D and 2-D 1H/13C NMR spectroscopy, mass spectrometry, etc.
• Elements (types, amounts); Molecular Fragments (types, amounts); Interfragment Bonds (types, amounts)
• Computer Assisted Structure Elucidation
• 3-D Models
• Atomistic Simulations: molecular dynamics, molecular mechanics, etc.
• Structural Properties (1H/13C NMR, IR spectrum, etc.) and Thermodynamic Properties (bulk density, solubility parameter, etc.)
• Model Selection: selection of reliable 3-D models
Guiding Principles for the Hierarchical Approach for
3-D Structural Modeling of Humic Acids
• HAs from different sources (e.g., soils, plants, sediments and streams) have different structural characteristics.
• No single structural model can be used to describe HAs from different sources.
• Given a set of reliable structural data, the hierarchical approach shown in Figure 1 can be used to generate all the 3-D models that best match the structural data for the HA of interest.
• These models can then be used in subsequent calculations of their bulk thermodynamic and structural properties (e.g., density, solubility parameter, 13C NMR spectrum, etc.) by standard and validated methods of computational chemistry.
• Only models that yield bulk thermodynamic and structural properties in agreement with the experimental data can be considered as reliable 3-D structural models for the HA of interest.
MacCarthy’s First Principle of Humic Substances
• MacCarthy’s “First Principle of Humic Substances” (P. MacCarthy, In
Humic Substances: Structures, Models and Functions E.A. Ghabbour,
G. Davies, Eds. Royal Society of Chemistry Special Publication 273,
2001, pp 19-30.)
• “Humic substances comprise an extraordinarily complex,
amorphous mixture of highly heterogeneous, chemically
reactive yet refractory molecules, produced during early
diagenesis in the decay of biomatter, and formed
ubiquitously in the environment via processes involving
chemical reactions of species randomly chosen from a
pool of diverse molecules and through random chemical
alteration of precursor molecules.”
3-D Structural Modeling of Chelsea Soil Humic Acid
• Chelsea soil HA was selected as the model HA to illustrate this new methodology.
• The Chelsea HA sample was extracted from Houghton muck, a Histosol soil widely found in the Great Lakes region of the USA [Michigan, Wisconsin, Minnesota, Illinois, Indiana and Ohio].
• The selection of Houghton muck as the HA source sample was partially motivated by the availability of data on its origin and insight into the mechanisms of formation of Chelsea soil HA (USDA-NRCS Soil Survey Division).
• The native vegetation that led to the formation of Houghton muck consisted predominantly of grasses, sedges, reeds, buttonbrush and cattails.
• The poor drainage of Houghton muck, the characteristics of its native vegetation and the relatively large mean residence time of organic matter in Histosol soils (1) suggest that the condensation of plant degradation products (e.g., lignin degradation products, sugars, amino acids, etc.) was a major formation pathway for Chelsea soil HA.
Experimental Characterization of Chelsea Humic Acid
• Elemental Analysis
• Diffuse Reflectance FT-IR Spectroscopy
• 1-D 13C and 1H Solution NMR Spectroscopy
• 2-D Solution NMR Spectroscopy (TOCSY and HMQC)
• ESI Quadrupole Time-of-Flight Mass Spectrometry
Figure 3: Electrospray ionization (ESI) quadrupole time-of-flight (Q-ToF) mass spectrum
for Chelsea humic acid. The spectrum exhibits the broad distribution of peaks observed in
typical mass spectra of humic substances. It tails off at approximately 1200 Da, thereby
suggesting that higher molecular weight compounds are not significant components or
building blocks of Chelsea soil humic acid.
Experimental Characterization of Chelsea Humic
Acid: Summary of Results
• The organic normalized weight fractions for C (51.31%), H (4.00%), O (39.67%), N (4.12%) and S (0.90%) and the O/C atomic ratio (0.58) for Chelsea soil HA are typical of soil humic acids.
• Overall, the results of the DRIFT and 1-D and 2-D solution NMR spectroscopic experiments are consistent with the hypothesis that the condensation of plant degradation products (e.g., lignin degradation products, sugars, amino acids, etc.) was a major formation pathway for Chelsea soil HA.
• The ESI Q-ToF mass spectrum of Chelsea soil HA tails off at approximately 1200 Da, thereby suggesting that higher molecular weight compounds are not significant components or building blocks of Chelsea soil HA.
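The O/C atomic ratio quoted above follows directly from the weight fractions. The short sketch below redoes that arithmetic; the standard atomic masses are the only values not taken from the slide.

```python
# Standard atomic masses (g/mol); these are not from the slide.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007, "S": 32.06}

# Organic normalized weight fractions (%) reported for Chelsea soil HA.
weight_percent = {"C": 51.31, "H": 4.00, "O": 39.67, "N": 4.12, "S": 0.90}

# Moles of each element per 100 g of organic matter.
moles = {el: weight_percent[el] / ATOMIC_MASS[el] for el in weight_percent}

# Atomic ratios relative to carbon.
ratio_to_carbon = {el: moles[el] / moles["C"] for el in moles}

for el in ("H", "O", "N", "S"):
    print(f"{el}/C = {ratio_to_carbon[el]:.3f}")
```

The O/C value rounds to the 0.58 reported on the slide, and the same conversion gives the relative element amounts needed to express elemental analysis data as 0 level atomic signatures.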
Computer Assisted Structural Elucidation of Chelsea
Humic Acid
• In the second phase of this study, we used the stochastic generator of chemical structures (SIGNATURE) to generate all the 3-D structural models of Chelsea HA that are consistent with:
– the experimental data, and
– the hypothesized formation pathway of Chelsea HA
• The computer assisted structure elucidation program (SIGNATURE) performs three basic tasks:
– First, it calculates an exhaustive and non-overlapping list of molecular fragments and associated interfragment bonds that best match the structural input data for the humic acid (HA) of interest.
– In the second task, the software evaluates the total number of structural models that are consistent with the list of molecular fragments and interfragment bonds found in step 1.
– Finally, SIGNATURE generates all the 3-D models of the HA of interest or a statistically representative sample of these models by randomly connecting the “precursor molecules” and interfragment bonds found in step 1.
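The first task, finding fragment quantities that best match the input data, can be pictured with a toy search. The sketch below is not how SIGNATURE works internally: it uses a brute-force enumeration, only height-0 (element count) data, an invented target composition, and it ignores interfragment bonds entirely. The three fragments are real molecules chosen for illustration, and every name in the code is an assumption.

```python
from itertools import product

ELEMENTS = ("C", "H", "O", "N")

# Element counts per candidate fragment (illustrative choices).
FRAGMENTS = {
    "guaiacol": (7, 8, 2, 0),   # C7H8O2
    "glycine": (2, 5, 2, 1),    # C2H5NO2
    "glucose": (6, 12, 6, 0),   # C6H12O6
}

# Hypothetical target composition to be matched (C, H, O, N).
target = (22, 33, 12, 1)


def mismatch(counts):
    """Sum of squared differences between a fragment mix and the target."""
    total = [0] * len(ELEMENTS)
    for n, composition in zip(counts, FRAGMENTS.values()):
        for k, c in enumerate(composition):
            total[k] += n * c
    return sum((t - g) ** 2 for t, g in zip(total, target))


# Try every combination of 0 to 3 copies of each fragment.
best = min(product(range(4), repeat=len(FRAGMENTS)), key=mismatch)
best_mix = dict(zip(FRAGMENTS, best))
```

With dozens of candidate fragments and higher-level signatures, exhaustive enumeration of this kind quickly becomes impractical, which is one reason a dedicated program is needed for this step.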
SIGNATURE Input Parameters for Chelsea Humic Acid:
Atomic Signatures
Atom Type                   Signature                                        h s(S)exp
C                           -                                                -
H                           h_                                               93.40
Osp3                        o_                                               43.50
Osp2                        o’                                               14.50
Nsp3                        n                                                6.90
Ssp3                        s                                                0.66
Aliphatic C                 c_                                               17.00
Aromatic C                  cp                                               34.00
Methyl C                    c_(h_h_h_*_)                                     3.00
α C amino acid              c_(n_c_h_*_)                                     3.00
Anomeric sugar C            c_(o_c_o_h_)                                     2.00
Carbonyl + carboxyl C       c=(o'*_*_*_)                                     24.00
O substituted aromatic C    cp(cpcpo_*_)                                     9.00
Methoxy aromatic C          o_(cp(cpcp*_)c_(h_h_h_)*_*_)                     6.00
CA Hexose sugar C           c_(o_(h_*_*_)c_(c_o_h_)o_(c_*_*_)h_(*_*_*_))     2.00
CB Hexose sugar C           c_(o_(h_*_*_)c_(c_o_h_)c_(o_o_h_)h_(*_*_*_))     2.00
CC Hexose sugar C           c_(o_(h_*_*_)c_(c_o_h_)c_(c_o_h_)h_(*_*_*_))     3.19
CE Hexose sugar C           c_(c_(o_h_h_)c_(c_o_h_)o_(c_*_*_)h_(*_*_*_))     2.00
CF Hexose sugar C           c_(o_(h_*_*_)c_(c_o_h_)h_(*_*_*_)h_(*_*_*_))     2.00
SIGNATURE Input Parameters for Chelsea Humic Acid:
Molecular Fragments and Interfragment Bonds
• Lignin Derived Fragments: 1-(4-Hydroxy-3,5-dimethoxyphenyl) ethanol; 1-(3,4-Dimethoxyphenyl) ethanol; 1-(4-Hydroxyphenyl) ethanol; 3,4,5-Trimethoxy cinnamic acid; 3,4-Dimethoxy cinnamyl alcohol; 3,4-Dimethoxy cinnamic acid; 3,4-Dimethoxy benzoic acid; 4-Methoxy cinnamic acid; 4-Hydroxy benzoic acid; Apocynol; Cinnamyl alcohol; Coniferyl alcohol; Dihydroferulic acid; Dihydrocoumaric acid; Eugenol; Ferulic acid; Gallic acid; Guaiacol; Guaiacyl propionic acid; Isoeugenol; Protocatechuic acid; Sinapyl alcohol; Sinapinic acid; Syringyl alcohol; Syringic acid; Syringol; Syringyl propionic acid; Vanillic acid; Veratric acid; Vinyl guaiacol; cis-Ferulic acid; p-Anisic acid; p-Coumaric acid; p-Coumaryl alcohol
• Amino Acids: Alanine; Arginine; Asparagine; Aspartic acid; Cysteine; Glutamic acid; Glutamine; Glycine; Histidine; Isoleucine; Leucine; Lysine; Methionine; Phenylalanine; Proline; Serine; Threonine; Tryptophan; Tyrosine; Valine
• Polyphenols: 1,2,3-Trihydroxy benzoic acid; 2,3,4-Trihydroxy benzoic acid; 2,3,6-Tricarboxy phenol; 2,4-Dicarboxy phenol; 2,4-Dihydroxy benzoic acid; 3,4,5-Trihydroxy benzoic acid; 3,4-Dihydroxy benzoic acid; 3,5-Dihydroxy benzoic acid; 3-Hydroxy benzoic acid; 4-Hydroxy benzoic acid; Phenol; o-Cresol; m-Cresol; p-Cresol; Phloroglucinol; Resorcinol
• Sugars: Allose; Arabinose; Fucose; Galactose; Galacturonic acid; Gluconic acid; Glucose; Gulose; Idose; Mannose; Mannuronic acid; Rhamnose; Ribose; Xylose
• Fatty Acids: Ethanoic acid; Propanoic acid; Butanoic acid; Pentanoic acid; Hexanoic acid; Heptanoic acid; Octanoic acid; Nonanoic acid; Decanoic acid; Undecanoic acid; Dodecanoic acid; Tridecanoic acid; Tetradecanoic acid; Pentadecanoic acid; Hexadecanoic acid; Heptadecanoic acid; Octadecanoic acid; Nonadecanoic acid; Eicosanoic acid
• Bonds: Caro_Caro; Caro_H; Caro_O; Caro_N; Cali_Caro; Cali_H; Cali_O; Cali_N
SIGNATURE Output Parameters for Chelsea Humic Acid:
Model Predictions for Atomic Ratios versus Analytical Input Data
Signature                                        h s(S)exp - h s(S)pred
h_                                               2.20
o_                                               5.70
o’                                               1.10
n                                                4.70
s                                                1.60
c_                                               9.70
cp                                               19.30
c_(h_h_h_*_)                                     1.40
c_(n_c_h_*_)                                     0.80
c_(o_c_o_h_)                                     0.20
c=(o'*_*_*_)                                     8.40
cp(cpcpo_*_)                                     2.10
o_(cp(cpcp*_)c_(h_h_h_)*_*_)                     0.70
c_(o_(h_*_*_)c_(c_o_h_)o_(c_*_*_)h_(*_*_*_))     0.20
c_(o_(h_*_*_)c_(c_o_h_)c_(o_o_h_)h_(*_*_*_))     0.20
c_(o_(h_*_*_)c_(c_o_h_)c_(c_o_h_)h_(*_*_*_))     0.90
c_(c_(o_h_h_)c_(c_o_h_)o_(c_*_*_)h_(*_*_*_))     0.20
c_(o_(h_*_*_)c_(c_o_h_)h_(*_*_*_)h_(*_*_*_))     0.20
Average atomic signature error                   3.00
Evaluation of 3-D Structural Properties from
Atomistic Simulations
• In the third phase of this study, we used SIGNATURE to generate all 18 3-D structural model isomers for Chelsea soil HA by randomly connecting the optimal “precursor molecules” and corresponding interfragment bonds found during the first stage of the model building process.
• The bulk density and solubility parameters of the Chelsea model isomers were subsequently calculated using standard and validated methods of computational chemistry (e.g., molecular mechanics and molecular dynamics simulations).
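For orientation, the two bulk properties named above reduce to simple formulas once a simulation has produced an equilibrated cell volume and a cohesive energy. The sketch below shows only that final arithmetic, using the usual definitions (mass per volume, and the square root of the cohesive energy density for the Hildebrand parameter). The function names are assumptions, and liquid water is used purely as a sanity check with approximate textbook values; none of this reproduces the simulation protocol behind the slide.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def bulk_density(molar_mass_g_mol, n_molecules, cell_volume_A3):
    """Density in g/cm^3 of n molecules in a periodic cell (volume in cubic angstroms)."""
    mass_g = n_molecules * molar_mass_g_mol / AVOGADRO
    volume_cm3 = cell_volume_A3 * 1.0e-24  # 1 cubic angstrom = 1e-24 cm^3
    return mass_g / volume_cm3


def hildebrand_parameter(cohesive_energy_J_mol, molar_volume_cm3_mol):
    """Hildebrand solubility parameter in J^1/2 / cm^3/2."""
    return math.sqrt(cohesive_energy_J_mol / molar_volume_cm3_mol)


# Sanity check on liquid water near room temperature (approximate values).
rho_water = bulk_density(18.015, 1, 29.9)
delta_water = hildebrand_parameter(41500.0, 18.07)
```

Because 1 J/cm^3 equals 1 MPa, values in J^1/2/cm^3/2 are numerically the same as values in MPa^1/2.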
Calculated bulk densities (A) and Hildebrand solubility parameters (B) of the SIGNATURE-generated 3-D models for Chelsea soil humic acid.
(A) Bulk density (g/cm^3) versus Chelsea soil humic acid model isomer #. Estimated bulk density of humic substances: 1.20-1.45 g/cm^3.
(B) Solubility parameter (J^1/2/cm^3/2) versus Chelsea soil humic acid model isomer #. Estimated solubility parameter of soil humic acids: 23.0-28.0 J^1/2/cm^3/2.
Selected SIGNATURE Generated Structural Models for Chelsea
Humic Acid
Chelsea soil humic acid model # 4: ρ = 1.33 g/cm^3 and δ = 27.80 J^1/2/cm^3/2
Chelsea soil humic acid model # 6: ρ = 1.40 g/cm^3 and δ = 25.50 J^1/2/cm^3/2
Chelsea soil humic acid model # 9: ρ = 1.43 g/cm^3 and δ = 28.40 J^1/2/cm^3/2
Chelsea soil humic acid model # 5: ρ = 1.40 g/cm^3 and δ = 27.80 J^1/2/cm^3/2
Chelsea soil humic acid model # 8: ρ = 1.42 g/cm^3 and δ = 28.00 J^1/2/cm^3/2
Summary and Conclusions
• We have combined experimental characterization (elemental analysis, FT-IR spectroscopy, 1-D and 2-D 1H/13C NMR spectroscopy and electrospray ionization quadrupole time-of-flight mass spectrometry) with computer assisted structure elucidation and atomistic simulations to generate all the 3-D structural models for Chelsea soil humic acid that are consistent with the structural data and available bulk thermodynamic properties of humic acids.
• We find that Chelsea soil humic acid can be described as a “simple” mixture of a limited number of low molar mass “molecularly heterogeneous” model isomers. The simulated 13C NMR spectrum of a mixture of these model isomers compares very well with the measured spectrum of Chelsea soil humic acid.
Potential Impacts of Methodology in Humic Substances
Research
• For HAs formed predominantly through the biotic/abiotic condensation of plant degradation products (e.g., lignin degradation products, carbohydrates, amino acids, etc.) such as those found in Histosol, Mollisol and peat soils (1), a systematic application of our methodology to bulk HA samples and well resolved HA fractions from these soils is expected to result in the development of reliable 3-D structural models.
• Such models could then be used in subsequent integrated experimental and computational studies to address some key fundamental questions:
1. What are the 3-D structures of HAs in the bulk phase, in aqueous solutions and at mineral-water interfaces?
2. Do organic geo-macromolecules such as HAs, with no well defined head and tail, self-assemble into ordered micelle/membrane-like aggregates or disordered fractal-like aggregates in aqueous solutions and at mineral-water interfaces?
3. What are the “molecular” scale locations and preferred coordination environments for metal ions and organic guests bound to HA hosts in aqueous solutions, in soils and at mineral-water interfaces?
4. How strong is the binding of metal ions and organic guests to HA hosts in aqueous solutions, in soils and at mineral-water interfaces?
Acknowledgments
• Prof. Weilin Huang (Drexel University) for providing the Chelsea HA samples
• USEPA GLMA Center for Hazardous Substance Research (funding to Howard University)
• Department of Commerce (funding to Howard University and Caltech)
• National Science Foundation (funding to The Ohio State Environmental Molecular Science Institute)
• Environmental Molecular Sciences Laboratory (PNNL) for analytical support