Front page for deliverables

Project no.: 003956
Project acronym: NOMIRACLE
Project title: Novel Methods for Integrated Risk Assessment of Cumulative Stressors in Europe
Instrument: IP
Thematic Priority: 1.1.6.3, ‘Global Change and Ecosystems’
Topic: VII.1.1.a, ‘Development of risk assessment methodologies’

Deliverable reference number and title: D.2.3.12 Report on first prototype of model for predicting biodegradation pathways in soil
Due date of deliverable: 31 January, 2007
Actual submission date: 23 January, 2008
Start date of project: 1 November 2004
Duration: 5 years
Organisation name of lead contractor for this deliverable: LMC
Revision [draft, 1, 2, …]: Final Report

Project co-funded by the European Commission within the Sixth Framework Programme (2002-2006)

Dissemination Level
PU  Public  X
PP  Restricted to other programme participants (including the Commission Services)
RE  Restricted to a group specified by the consortium (including the Commission Services)
CO  Confidential, only for members of the consortium (including the Commission Services)

D. 2.3.12

Authors and their organisation: Gergana Dimitrova, Sabcho Dimitrov, Ovanes Mekenyan, Laboratory of Mathematical Chemistry, Bourgas As. Zlatarov University, Bulgaria (Partner 35)
Deliverable no: D. 2.3.12.
Nature: Report
Status: Final Report
Reviewed by (period and name): February 4-18, 2008, Ralph Kühne
Dissemination level: PU
Date of delivery: 21 February 2008
Date of publishing: February 2008

Contents
1. Introduction
2. Materials and methods
2.1. Observed metabolic pathways in soil
2.2. Simulation of metabolism
2.3. Similarity of metabolic pathways
3. Results
4. Summary
5. References

1. Introduction

The aim of this study is to build a prototype of a simulator for terrestrial biodegradation using a database of observed biodegradation pathways in soil.
The modelling methodology is based on a multipathway probabilistic approach to the simulation of metabolism. The core of the approach is a library of hierarchically ordered individual transformations (catabolic steps) and a substructure matching engine that applies them consecutively. A collection of biodegradation pathways in soil for 183 chemicals was used to adapt the currently available simulator to the terrestrial environment. The quality of the simulator was quantified by the degree of reproducibility between observed and generated metabolites.

2. Materials and methods

2.1. Observed metabolic pathways in soil

An electronic database of documented biodegradation pathways in soil was compiled from different sources, including monographs, scientific articles and public web sites. The most significant sources for these data were [1-3]. The database includes observed biodegradation pathways for 183 chemicals, mainly pesticides: herbicides, insecticides, fungicides, acaricides, etc. The data set covers substances with a variety of chemical functionality, such as acid amides, anilines and nitrobenzenes, dithio- and thiolcarbamates, five- and six-membered heterocyclic compounds, phenyl (aryl) carbamates, phosphoro(di)thiolates, sulfonylureas, etc. Additional information on microbial species, bacterial strain and the presence or absence of oxygen was also included in the database. The catabolic pathways database for soil was used to extract new transformations and to adapt the currently available simulator to the terrestrial environment.

2.2. Simulation of metabolism

The modelling methodology is based on the probabilistic approach. Catabolism is simulated via the principal molecular transformations extracted from the metabolic pathway database. A molecular transformation consists of a parent sub-molecular fragment, transformation products and inhibitory functional groups (masks). The masks serve as reaction inhibitors.
If a fragment assigned as a mask is attached to the target sub-fragment, the transformation is not executed on the parent chemical. Recently, a multi-pathway approach for simulating molecular transformations of chemicals in biotic and abiotic conditions was proposed [4-6]. The development of this approach was motivated by the fact that, alongside the most probable pathway, chemicals can be metabolized by a number of less probable pathways. Initially, the parent chemical is submitted to the list of transformations, and all transformations whose associated substructures match are applied to the parent, producing the list of first-level metabolites. Each of the generated metabolites is then submitted to the same list of transformations to produce the second level of metabolites, etc. The hierarchy of the transformations and their probabilities are used to control the propagation of the catabolic maps of the chemicals. If the probability of obtaining a certain metabolite is less than a critical value (0.001 in the current version of the simulator), the propagation of metabolism is stopped and this metabolite is treated as an end product.

2.3. Comparison of metabolic pathways

An important characteristic of the metabolism simulator is its ability to correctly reproduce the observed metabolic maps. The predictability of the simulator is estimated on the basis of the union of the observed and predicted metabolic maps, as illustrated in Figure 1.

[Figure 1 schematic: Observed catabolism + Predicted catabolism = Union of pathways]
Figure 1. Predicted and observed catabolism and their union.
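The level-by-level propagation described in Section 2.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the CATABOL implementation: chemicals are represented as plain strings instead of molecular graphs, a mask blocks a transformation if it occurs anywhere in the string (rather than attached to the target fragment), the probability of a metabolite is taken as the product of the step probabilities along its best pathway, and the hierarchy of transformations is ignored. The names `Transformation`, `simulate`, `TOY_RULES` and `max_depth` are hypothetical; only the cutoff value 0.001 comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Transformation:
    """Toy transformation: fragment -> product, blocked by any mask."""
    name: str
    fragment: str
    product: str
    probability: float
    masks: Tuple[str, ...] = ()

    def apply(self, chemical: str) -> Optional[str]:
        # No match: the target fragment is absent from the chemical.
        if self.fragment not in chemical:
            return None
        # Simplification: a mask anywhere in the string inhibits the step.
        if any(mask in chemical for mask in self.masks):
            return None
        return chemical.replace(self.fragment, self.product, 1)


def simulate(parent: str,
             transformations: List[Transformation],
             threshold: float = 0.001,
             max_depth: int = 10) -> Dict[str, float]:
    """Return {metabolite: probability of its most probable pathway}."""
    best: Dict[str, float] = {}
    frontier = [(parent, 1.0)]
    for _ in range(max_depth):  # hypothetical safety bound on tree depth
        next_frontier = []
        for chemical, p in frontier:
            for t in transformations:
                product = t.apply(chemical)
                if product is None:
                    continue
                q = p * t.probability  # assumed: product of step probabilities
                if q <= best.get(product, 0.0):
                    continue  # an equally or more probable route is known
                best[product] = q
                # Below the critical value the metabolite is kept as an
                # end product and is not submitted to further steps.
                if q >= threshold:
                    next_frontier.append((product, q))
        if not next_frontier:
            break
        frontier = next_frontier
    return best


# Hypothetical toy rules on string "molecules", for illustration only.
TOY_RULES = [
    Transformation("nitrile hydrolysis", "CN", "CONH2", 0.8),
    Transformation("amide hydrolysis", "CONH2", "COOH", 0.5),
    Transformation("toy low-probability step", "COOH", "H", 0.002),
    Transformation("toy follow-up step", "R-H", "X", 0.9),
]
```

Calling `simulate("R-CN", TOY_RULES)` generates three metabolites; the last one falls below the critical value, so the follow-up step is never applied to it.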
In order to quantify the metabolism predictability of the model, we distinguished four types of nodes in the observed and simulated maps:
- Root of the tree (parent chemical);
- Observed and predicted metabolites, $S_{Obs} \cap S_{Pred}$;
- Observed and not predicted metabolites (false negatives), $S_{Obs} \setminus S_{Pred}$;
- Predicted and not observed metabolites (false positives), $S_{Pred} \setminus S_{Obs}$,
where $S_{Obs}$ and $S_{Pred}$ are the sets of observed and predicted metabolites.

Based on the union of metabolic trees, the following statistics can be applied for evaluating the performance of the simulator.

Probability that the metabolite is truly observed, given that the metabolite is predicted (predictability):
\[ Pred = \frac{\mathrm{Card}(S_{Obs} \cap S_{Pred})}{\mathrm{Card}(S_{Pred})} \qquad (1) \]

Probability that the metabolite is predicted, given that the metabolite is truly observed (sensitivity):
\[ Sens = \frac{\mathrm{Card}(S_{Obs} \cap S_{Pred})}{\mathrm{Card}(S_{Obs})} \qquad (2) \]

Probability that the metabolite is not predicted, given that the metabolite is truly observed (false negatives):
\[ Fneg = \frac{\mathrm{Card}(S_{Obs} \setminus S_{Pred})}{\mathrm{Card}(S_{Obs})} = 1 - Sens \qquad (3) \]

Probability that the metabolite is not observed, given that the metabolite is predicted (false positives):
\[ Fpos = \frac{\mathrm{Card}(S_{Pred} \setminus S_{Obs})}{\mathrm{Card}(S_{Pred})} = 1 - Pred \qquad (4) \]

In equations (2 - 4) the parent chemical is not taken into account.

3. Results

The collected data with documented biodegradation pathways in soil for 183 chemicals were used to adapt the currently available simulator for biodegradation in water. All statistics presented below are calculated on the basis of these training chemicals. The number of transformations in the old version of the simulator was 609.
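As an illustration of equations (1)-(4) in Section 2.3, the four statistics can be computed directly from the two metabolite sets. This is a minimal sketch with hypothetical names (`pathway_statistics` and the metabolite labels in the usage note); it assumes metabolites are identified by unique labels and removes the parent chemical from both sets for all four statistics, which the text states explicitly only for equations (2)-(4).

```python
from typing import Dict, Set


def pathway_statistics(observed: Set[str],
                       predicted: Set[str],
                       parent: str) -> Dict[str, float]:
    """Compute Pred, Sens, Fneg and Fpos for one chemical."""
    # The parent chemical (root of the tree) is not counted as a metabolite.
    s_obs = set(observed) - {parent}
    s_pred = set(predicted) - {parent}
    if not s_obs or not s_pred:
        raise ValueError("both metabolite sets must be non-empty")

    common = s_obs & s_pred  # observed and predicted metabolites
    return {
        "Pred": len(common) / len(s_pred),          # equation (1)
        "Sens": len(common) / len(s_obs),           # equation (2)
        "Fneg": len(s_obs - s_pred) / len(s_obs),   # equation (3), 1 - Sens
        "Fpos": len(s_pred - s_obs) / len(s_pred),  # equation (4), 1 - Pred
    }
```

For example, with observed metabolites M1, M2, M3 and predicted metabolites M1, M2, M4, M5, M6, two of the five predicted metabolites are observed (Pred = 0.4) and two of the three observed metabolites are predicted (Sens = 2/3).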
To evaluate the performance of this old version, we used the collected data set of 183 observed biodegradation pathways. Two measures of simulator performance were used: the sensitivity (the probability that a metabolite is predicted, given that it is truly observed) and the predictability (the probability that a metabolite is truly observed, given that it is predicted). The simulator developed to predict metabolism in water showed low sensitivity and predictability: 57% and 45%, respectively. To improve the performance of the simulator, new reactions were included and some specific masks were added to the existing transformations. As a result of these modifications, the number of transformations increased to 689. Table 1 lists some of the newly entered reactions and their inhibiting masks.

Table 1. List of newly entered reactions and their inhibiting masks.
[The reaction schemes and mask structures were drawn as chemical structures and are not reproduced here; in transformation 3, Hal = F, Cl, Br, I.]

#    Principal transformation           Probability
1    Aromatic ring cleavage             1.00
2    Nitrile and amide hydrolysis       0.8143
3    Aromatic ring oxidation            0.1157
4    Oxidative O-dealkylation           0.0100
5    Oxidative C-S bond cleavage        0.0100
6    Imidic-amide tautomerization       0.0100
7    Aromatic ring oxidation            0.0100
8    Aliphatic sulphur oxidation        0.0100
9    Oxidative N-dealkylation           0.0100
10   Nitrogen formylation               0.0000

The improvement in the simulator performance is presented in Figure 2. As can be seen from the figure, the overall sensitivity increased to 72%, compared with 57% for the version for biodegradation in water.

[Figure 2 panels: (A) Water system; (B) Water and soil compartment; the position of Sencor is marked.]
Figure 2.
Comparison of predictability of the biodegradation simulators for the water (A) and water and soil (B) compartments.

Sencor is an example of a chemical whose sensitivity increased from 0% to 100% as a result of the modifications of the model. As can be seen from Figure 2, Sencor predicted with the biodegradation simulator for water falls in the bar of structures with sensitivity values between 0% and 10%. With the new simulator for biodegradation in soil, the same chemical reaches a sensitivity of 100%, because all documented metabolites are reproduced. The observed and predicted biodegradation pathways in soil for this chemical are presented in Figures 3A and 3B, respectively. As Figure 3 shows, the modified simulator correctly generates all metabolites of the observed biodegradation pathway.

[Figure 3 panels: (A) Observed metabolism; (B) Simulated metabolism]
Figure 3. Observed (A) and simulated (B) biodegradation pathway in soil for Sencor; green: observed and predicted metabolites; yellow: not observed but predicted intermediates; grey: predicted metabolites not accounted for in the similarity.

Although the number of chemicals in the low-sensitivity bars of Figure 2 decreased significantly with the modified simulator, there are still six structures for which the sensitivity remains 0%: Chloroanisidine, Thiophanate methyl, Clethodim, Tetradifon, Alloxydim sodium and Tebufenozide. The reasons for their low predictability can be summarized as follows: 1) incompletely documented biodegradation pathways with missing intermediates, and 2) software limitations that at this stage make it impossible to reproduce some transformations, for example condensation reactions.
These problems call for collecting new biodegradation information in soil for the problematic chemicals and for implementing new metabolism logic in the software so that the documented pathways can be reproduced. Figure 4 shows, as an example, part of the documented biodegradation pathway for Chloroanisidine, for which the simulator is unable to reproduce the formation of a dimer. The structures indicated in red represent the observed but not predicted metabolites.

Figure 4. Part of the documented biodegradation pathway for Chloroanisidine, including the formation of a dimer.

An example of a chemical with an incomplete observed metabolic pathway is illustrated in Figure 5. The documented biodegradation pathway for Alloxydim sodium does not present the full catabolism of the chemical and cannot be used for the extraction of a reasonable transformation.

Figure 5. An example of a chemical with an incomplete observed metabolic pathway.

4. Summary

The work presented here describes the first prototype of the model for predicting biodegradation pathways in soil. Collected data with observed biodegradation pathways in soil for 183 chemicals were used to adapt the CATABOL model to simulate their catabolism. Eighty new reactions were added to the currently available simulator for biodegradation in water. The quality of the simulator was quantified by the degree of reproducibility between observed and generated metabolites. The results showed a significant improvement in sensitivity, which increased from 57% to 72%.

5. References

1. Aizawa, H., Metabolic Maps of Pesticides: Ecotoxicology and Environmental Quality Series, Academic Press, New York, 1982.
2. Aizawa, H., Metabolic Maps: Pesticides, Environmentally Relevant Molecules and Biologically Active Molecules, Academic Press, San Diego, 2001.
3. Pesticides: Benefaction or Pandora's Box? A synopsis of the environmental aspects of 243 pesticides, J. B.
H. J. Linders, J. W. Jansma, B. J. W. G. Mensink, K. Otermann, Report no 679101014, March 1994.
4. O.G. Mekenyan, S.D. Dimitrov, T.S. Pavlov, G.D. Veith. Curr. Pharmaceut. Design, 10, 1273 (2004).
5. O.G. Mekenyan, S. Dimitrov, R. Serafimova, E. Tompson, S. Kotov, N. Dimitrova, J. Walker. Chem. Res. Toxicol., 17, 753 (2004).
6. S.D. Dimitrov, T.S. Pavlov, R. Vasilev, O. Mekenyan. Simulation of abiotic transformations by CATABOL, poster presented at SETAC Europe 15th Annual Meeting, Lille, France, 21-26 May (2005).