EXAMPLES OF CURRENTLY USED (Q)SAR-MODELS IN DANISH EPA

Endpoint: BCF (Bintein)
Reference: Bintein and Devillers, SAR and QSAR in Environmental Research, Vol. 1, pp. 29-39 (1993): log BCF = 0.910 log P - 1.975 log(6.8 × 10^-7 P + 1) - 0.786; r=0.950, s=0.347, F=463.51
Source: Bintein et al. 1993
Training set: n=154

Endpoint: Biodegradation II (DK)
Reference: Multicase model. Data from MITI report; "Biodegradation and Bioaccumulation Data of Existing Chemicals based on the CSCL Japan", edited by Chemicals Inspection & Testing Institute, Japan, Oct. 1992, Japan Chemical Industry Ecology - Tox. & Info-Center, and supplemented with extra data from Phil Howard, Syracuse, for readily biodegradable substances.
Source: DK-EPA
Training set: n=745
Validation: LGO (10%) gave: Sens.=77%, Spec.=76%, Conc.=77%
Domain*: 60%

Endpoint: Fish, LC50 (96h)
Reference: Multicase model. Data from Brooke, L.T. et al.: "Acute Toxicities of Organic Chemicals to Fathead Minnows (Pimephales promelas)", Center for Lake Superior Environm. Studies, University of Wisconsin, Superior 1988.
Source: DK-EPA
Training set: n=569
Validation: LGO (10%) gave: q2=0.76
Domain*: 47%

Endpoint: Daphnia, LC50 (48h)
Reference: Multicase model. Data from Aquire Database, Kuhn (Water Res. 23/1989), Joop, DK-EPA tests and O.C. Hansen, DTI Environment, "Quantitative Structure-Activity Relationships (QSAR) and Pesticides", April 1999
Source: DK-EPA
Training set: n=641
Validation: LGO (10%) gave: q2=0.69
Domain*: 53%

Endpoint: Algae, EC50 (growth)
Reference: Multicase model. Ongoing testing at the Danish Technical University (DTU) on Pseudokirchneriella subcapitata (Selenastrum capricornutum), plus data from various sources in the open literature
Source: DK-EPA
Training set: n=476
Validation: LGO (50%) gave: q2=0.77
Domain*: 39%

Endpoint: Tetrahymena pyriformis (46h, IGC50)
Reference: Multicase model. Data generated by Dr. Terry Schultz, Univ. Tennessee (semi-confidential data set)
Source: DK-EPA
Training set: n=1860
Validation: LGO (50%) gave: q2=0.73
Domain*: 51%

Endpoint: LD50 (Rat, acute oral)
Reference: TOPKAT model
Source: Commercial
Training set: n=approx. 4000
Validation: LOO of the 19 sub-models gave 86-100% of estimates falling within a factor of five from test results. DK-EPA external validation with 1840 substances gave q2=0.31 and 86% estimated to be within a factor of ten from test results.
Domain*: 66%

Endpoint: LOAEL (Rat, Chronic)
Reference: TOPKAT model
Source: Commercial
Training set: n=393
Validation: LOO of the 5 sub-models gave 95% of estimates falling within a factor of 3-5 from test results.
Domain*: 61%

Endpoint: Skin Irritation (Severe vs. mild)
Reference: Multicase model. Data from RTECS Nov. 2000, HSDB Nov. 2000, and 22nd Amendment of EU Annex I
Source: DK-EPA
Training set: n=800
Validation: LGO (20%) gave: Sens.=59.7%, Spec.=90.5%, Conc.=77.4%
Domain*: 50%

Endpoint: Sensitization I (dermal)
Reference: TOPKAT model, GPMT data from open literature
Source: Commercial
Training set: n=389
Validation: LOO gave: Sens.=85-94%, Spec.=87-96%
Domain*: 36%

Endpoint: Sensitization I (dermal) - strong vs. mild
Reference: TOPKAT model, GPMT data from open literature
Source: Commercial
Training set: n=390
Validation: LOO gave: Sens.=88-96%, Spec.=88-98%
Domain*: 42%

Endpoint: Sensitization II (dermal)
Reference: Multicase model A33, data from GPMT or human experience (ACD Humans)
Source: Commercial
Training set: n=1033
Validation: LGO (10%) gave: Sens.=69-89%, Spec.=89-94%, Conc.=82-89%
Domain*: 58%

Endpoint: Respiratory allergy
Reference: Multicase model. Data from Graham et al.; Regul. Toxicol. and Pharmacol. 26; 296-306 (1997)
Source: DK-EPA
Training set: n=79
Validation: LGO (10%) gave: Sens.=71.4%, Spec.=90.0%, Conc.=85.2%
Domain*: 26%

Endpoint: Reproductive toxicity
Reference: TOPKAT model, data from open literature (rat oral data)
Source: Commercial
Training set: n=273
Validation: LOO of the 3 sub-models gave: Sens.=86-89%, Spec.=86-97%
Domain*: 37%

Endpoint: Teratogenicity (Human)
Reference: Multicase model A49, data from FDA-TERIS program
Source: Commercial
Training set: n=323
Validation: LGO (10%) gave: Sens.=60-100%, Spec.=75-92.9%, Conc.=70-88.9%
Domain*: 43%

Endpoint: Estrogen-α Receptor Binding Assay - balanced model
Reference: Multicase model. Data from Japanese METI test data (6th Meeting of the Task Force on Endocrine Disrupters Testing and Assessment (EDTA), 24-25 June 2002, Tokyo, Appendix I)
Source: DK-EPA
Training set: n=803
Validation: LGO (50%) gave: Sens.=77%, Spec.=87%, Conc.=82%
Domain*: 52%

Endpoint: Estrogen reporter gene
Reference: Multicase model. Preliminary assessment of Human estrogen-α Reporter Gene model based on Japanese METI test data (6th Meeting of the Task Force on Endocrine Disrupters Testing and Assessment (EDTA), 24-25 June 2002, Tokyo, Appendix I) after introduction of new structural notations.
Source: DK-EPA
Training set: n=899
Validation: LGO (50%) gave: Sens.=73%, Spec.=87% and Conc.=81.4%. External validation for specificity: 418 Reporter Gene assay chemicals which tested inactive, and which were excluded prior to modelling, were predicted and gave the following results: of 384 Active or Inactive predictions, 336/384 or 88% were correctly predicted as negative; of the 303 chemicals which were fully within the AOK model domain, 274/303 or 90% were predicted negative.
Domain*: 61%

Endpoint: Antiandrogen effect
Reference: Multicase model. Data from the open literature plus approximately 200 test results funded by DK-EPA and made at the Danish Institute for Veterinary and Food Research in 2003/2004.
Source: DK-EPA
Training set: n=340
Validation: LGO (50%) gave: Sens.=64%, Spec.=90% and Conc.=78%.
Domain*: 31%

Endpoint: Arh Receptor Binding
Reference: Multicase model. Rannug et al., Carcinogenesis vol. 12, no. 11, pp. 2007-2015, 1991.
Source: Commercial
Training set: n=148
Validation: LGO (10%) gave: Sens.=80-100%, Spec.=71-100% and Conc.=75-100%.
Domain*: 7%

Endpoint: G.I. absorption
Reference: Samuel H. Yalkowsky, "A New Absorption Parameter", QSAR 2002 (10th International Workshop on QSARs in Environmental Science, May 2002, Ottawa, Ontario, Canada)
Source: Conference paper
Training set: n=132
Validation: External validation with 98 chemicals gave: 96% of well absorbed substances correctly predicted and 70% of poorly absorbed substances correctly predicted. Overall concordance 89%.
Domain*: -

Endpoint: Structural alerts for DNA reactivity (Ashby)
Reference: Multicase model A2E, data from Ashby and associates
Source: Commercial
Training set: n=784
Validation: LGO (10%) gave: Sens.=84.6-97.6%, Spec.=60-69.2%, Conc.=80.6-87.1%
Domain*: 66%

Endpoint: Ames mutagenicity I
Reference: TOPKAT model, Mutagenicity (Ames)
Source: Commercial
Training set: n=1866
Validation: LOO result of the 10 sub-modules: Sens. and Spec. of 75-100%. Ext. val. by DK-EPA (1998) with 118 chemicals (A: 50 and I: 68) gave: Sens.=76% and Spec.=82%
Domain*: 61%

Endpoint: Ames mutagenicity II
Reference: Multicase model A2H, data from NTP (NIEHS-USA) or GENETOX (EPA)
Source: Commercial
Training set: n=2034
Validation: LGO (10% out) gave: Sens.=75-78.5%, Spec.=78.2-90%, Conc.=78.3-83.2%. DK-EPA LGO (50%) validation gave: Sens.=72%, Spec.=89%, Conc.=80%, and external validation with 53 CPDB chemicals (27 pos and 26 neg) gave Sens.=74%, Spec.=96% and Conc.=85%
Domain*: 81% (Coverage for MCASE Ames or TOPKAT Ames: 90.1%)

Endpoint: Ames mutagenicity - Direct, -S9
Reference: Multicase model. Data from Procter and Gamble
Source: DK-EPA
Training set: n=396
Validation: LGO (10%) gave: Sens.=71%, Spec.=82%, Conc.=78%
Domain*: 49%

Endpoint: Ames mutagenicity - Base-pair
Reference: Multicase model. Data from Procter and Gamble
Source: DK-EPA
Training set: n=207
Validation: LGO (50%) gave: Sens.=57%, Spec.=87%, Conc.=72%
Domain*: 35%

Endpoint: Ames mutagenicity - Frame-shift
Reference: Multicase model. Data from Procter and Gamble
Source: DK-EPA
Training set: n=314
Validation: LGO (50%) gave: Sens.=71%, Spec.=77%, Conc.=75%
Domain*: 52%

Endpoint: Ames mutagenicity - Rev.>10*Ctrl
Reference: Multicase model. Data from Procter and Gamble
Source: DK-EPA
Training set: n=189
Validation: LGO (50%) gave: Sens.=63%, Spec.=79%, Conc.=80%
Domain*: 43%

Endpoint: Chromosomal aberrations (CHO) (NTP)
Reference: Multicase model A61, data from NTP (tests in cultured CHO cells)
Source: Commercial
Training set: n=233
Validation: LGO (10%) gave: Sens.=44-80%, Spec.=50-80%, Conc.=56-80%.
Domain*: 43%

Endpoint: Chromosomal aberrations (CHL)
Reference: Multicase model. Data from: Motoi Ishidate, Jr., "Data Book of Chromosomal Aberration Test In Vitro", Biological Research Centre, National Institute of Hygienic Sciences, Japan, Elsevier Life-Science Information Center, New York 1988. ** Additional data from: Mutagenicity Database of Environmental Chemicals, as reported by Dr. Motoi Ishidate, Jr. Taken from Sofuni, T. (Ed), "Data Book of Chromosomal Aberration Test In Vitro", LIC, Tokyo, 1998.
Source: DK-EPA
Training set: n=506
Validation: LGO (50%) gave: Sens.=64%, Spec.=83%, Conc.=74%. LGO (10%) gave: Sens.=63%, Spec.=86%, Conc.=76%. External validation with 62 chemicals within domain gave: Sens.=59%, Spec.=82%, Conc.=76%.
Domain*: 63%

Endpoint: Mutations in Mouse Lymphoma (GTX)
Reference: Multicase model. Data from Grant et al., Mut. Res. 465 (2001) pp. 201-229
Source: Grant et al. 2001
Training set: n=555
Validation: LGO (50%) gave: Sens.=70%, Spec.=82% and Conc.=77%
Domain*: 54%

Endpoint: Mouse micronucleus II
Reference: Multicase model. Data from various sources in the open literature. The primary references are: Hayashi et al., "Micronucleus Tests in Mice on 39 Food Additives and Eight Miscellaneous Chemicals," Food Chem Toxicol, 26 (1988) 487-500; Mavournin et al., "The in vivo micronucleus assay in mammalian bone marrow and peripheral blood. A report of the U.S. Environmental Protection Agency Gene-Tox Program," Mutation Research, 239 (1990) 29-80; Waters et al., "The performance of short-term tests in identifying potential germ cell mutagens: A quantitative and qualitative analysis," Mutation Research, 341 (1994) 109-131; Morita et al., "Evaluation of the rodent micronucleus assay in the screening of IARC carcinogens (Groups 1, 2A and 2B). The summary report of the 6th collaborative study by CSGMT/JEMS - MMS," Mutation Research, 389 (1997) 3-122.
Source: DK-EPA
Training set: n=360
Validation: LGO (50%) gave: Sens.=51%, Spec.=84%, Conc.=71%
Domain*: 61%

Endpoint: Mouse Sister Chromatid Exchange, bone m. (in vivo)
Reference: Multicase model. Data from Genetox, Mut. Res. 297 (1993) pp. 101-180
Source: DK-EPA
Training set: n=210
Validation: LGO (50%) gave: Sens.=70%, Spec.=98% and Conc.=91%
Domain*: 51%

Endpoint: Rodent Dominant Lethal Test
Reference: Multicase model. Data from Genetox, Mut. Res. Vol. 154, No. 1, July 1985, plus updates
Source: DK-EPA
Training set: n=193
Validation: LGO (50%) gave: Sens.=44%, Spec.=94%, Conc.=76%
Domain*: 48%

Endpoint: Chinese Hamster Ovary Cell HGPRT Forward Mutation Assay
Reference: Multicase model. Primary information source: Genetox (Mutat. Res. Vol. 196, No. 1, July 1988) plus updates
Source: DK-EPA
Training set: n=243
Validation: LGO (50%) gave: Sens.=76%, Spec.=87%, Conc.=80%. External validation with 150 physiological chemicals expected to be negative for this endpoint gave for all chemicals: Spec.=94.8%, and gave for 77 AOK predictions: Spec.=97.5%
Domain*: 41%

Endpoint: Drosophila melanogaster Sex-Linked Recessive Lethal (SLRL)
Reference: Multicase model. Data from Mut. Res. Vol. 123, No. 2, Oct. 1983, mainly from Genetox, plus updates
Source: DK-EPA
Training set: n=368
Validation: LGO (50%) gave: Sens.=67%, Spec.=93%, Conc.=81%
Domain*: 56%

Endpoint: Mouse COMET assay
Reference: Multicase model. Data from Sasaki et al., Critical Reviews in Toxicology Vol. 30, issue 6, 2000, pp. 629-799, plus 89 physiological chemicals entered as inactives
Source: DK-EPA
Training set: n=288
Validation: LGO (50%) gave: Sens.=65%, Spec.=89%, Conc.=80%.
Domain*: 48%

Endpoint: Unscheduled DNA synthesis in Rat hepatocytes, in vitro
Reference: Multicase model. Data from Mutation Research, 221 (1989) 263-286, Williams et al., updated with tests from CCRIS and Toxline EMIC (Jan. 03)
Source: DK-EPA
Training set: n=413
Validation: LGO (50%) gave: Sens.=56.3%, Spec.=89.1%, Conc.=74.0%. External validation with 150 physiological chemicals expected to be negative for this endpoint gave: Spec.=96.7%.
Domain*: 60%

Endpoint: Syrian Hamster Embryo Cell Transformation Assay
Reference: Multicase model. Data from Isfort et al., Mutation Research 356 (1996) pp. 11-63 (169 test results), and Kerckaert et al., Mutation Research 356 (1996) pp. 65-84 (25 test results). 7 additional substances from Toxicol. Sci. 2002 Jul; 68(1) pp. 43-50, Mutation Research 392, no. 1,2 (1997) pp. 61-90 and Toxicol. Sci. 1998; 41(2) pp. 189-197. To balance the model against over-representation of positive tests, 39 physiological chemicals which were assumed to have a low probability of activity were added as negatives (source: Mutation Research 465 (2000) pp. 201-229).
Source: DK-EPA
Training set: n=363
Validation: LGO (50%) gave: Sens.=60%, Spec.=80%, Conc.=72.0%. External validation with 61 physiological chemicals expected to be negative for this endpoint gave: Spec.=58/61=95%. External validation with 16 positive tests not used in the model gave: Sens.=11/16=69%.
Domain*: 55%

Endpoint: Cancer (male Rat, FDA) open
Reference: Multicase model AF1. Matthews, J. and J.F. Contrera, Reg. Toxicol. and Pharm. 28, 242-264 (1998)
Source: Commercial
Training set: n=1085
Validation: LGO (20%) gave: Sens.=51-95%, Spec.=54-97%, Conc.=80-81%, Chi2=35-40
Domain*: 70%

Endpoint: Cancer (female Rat, FDA) open
Reference: Multicase model AF1. Matthews, J. and J.F. Contrera, Reg. Toxicol. and Pharm. 28, 242-264 (1998)
Source: Commercial
Training set: n=1079
Validation: LGO (20%) gave: Sens.=64-67%, Spec.=98-99%, Conc.=88-89%, Chi2=61-62
Domain*: 72%

Endpoint: Cancer (male Mouse, FDA) open
Reference: Multicase model AF1. Matthews, J. and J.F. Contrera, Reg. Toxicol. and Pharm. 28, 242-264 (1998)
Source: Commercial
Training set: n=1010
Validation: LGO (20%) gave: Sens.=64-68%, Spec.=96-99%, Conc.=90-91%, Chi2=46-52
Domain*: 69%

Endpoint: Cancer (female Mouse, FDA) open
Reference: Multicase model AF4. Matthews, J. and J.F. Contrera, Reg. Toxicol. and Pharm. 28, 242-264 (1998)
Source: Commercial
Training set: n=1025
Validation: LGO (20%) gave: Sens.=76-78%, Spec.=96%, Conc.=91-92%, Chi2=53-55
Domain*: 70%

Endpoint: Cancer (male Rat, FDA) proprietary
Reference: Multicase model AF5
Source: Commercial
Training set: -
Validation: External validation by Matthews and Contrera (Reg. Toxicol. and Pharm. 28, 242-264, 1998) with 100 test results not used in the model gave for the sum of the 4 proprietary models: Sens.=34/58=58.6%, Spec.=41/42=97.6% and Conc.=75/100=75%.
Domain*: 68%

Endpoint: Cancer (female Rat, FDA) proprietary
Reference: Multicase model AF6
Source: Commercial
Training set: -
Validation: External validation by Matthews and Contrera (Reg. Toxicol. and Pharm. 28, 242-264, 1998) with 100 test results not used in the model gave for the sum of the 4 proprietary models: Sens.=34/58=58.6%, Spec.=41/42=97.6% and Conc.=75/100=75%.
Domain*: 70%

Endpoint: Cancer (male Mouse, FDA) proprietary
Reference: Multicase model AF7
Source: Commercial
Training set: -
Validation: External validation by Matthews and Contrera (Reg. Toxicol. and Pharm. 28, 242-264, 1998) with 100 test results not used in the model gave for the sum of the 4 proprietary models: Sens.=34/58=58.6%, Spec.=41/42=97.6% and Conc.=75/100=75%.
Domain*: 67%

Endpoint: Cancer (female Mouse, FDA) proprietary
Reference: Multicase model AF8
Source: Commercial
Training set: -
Validation: External validation by Matthews and Contrera (Reg. Toxicol. and Pharm. 28, 242-264, 1998) with 100 test results not used in the model gave for the sum of the 4 proprietary models: Sens.=34/58=58.6%, Spec.=41/42=97.6% and Conc.=75/100=75%.
Domain*: 67%

Endpoint: CPDB Rat TD50
Reference: Multicase model. Data from Louis & Gold, Carcinogenic Potency Data Base, 1999 Berkeley (on-line version)
Source: DK-EPA
Training set: n=868
Validation: LGO (50%) gave: Sens.=68%, Spec.=81%, Conc.=74%, and regression of log measured vs. log calculated TD50 gave q2=0.43
Domain*: 69%

Endpoint: CPDB Mouse TD50
Reference: Multicase model. Data from Louis & Gold, Carcinogenic Potency Data Base, 1999 Berkeley (on-line version)
Source: DK-EPA
Training set: n=720
Validation: LGO (50%) gave: Sens.=38%, Spec.=86%, Conc.=66%, and regression of log measured vs. log calculated TD50 gave q2=0.22
Domain*: 65%

Endpoint: Hepatocarcinogenicity in rodents (Rat or Mouse)
Reference: Multicase model. Data from Gold and Ames: Carcinogenic Potency Data Base, 1999 Berkeley (on-line version)
Source: DK-EPA
Training set: n=320
Validation: LGO (50%) gave: Sens.=33%, Spec.=91%, Conc.=74%. External evaluation with 182 negative tests gave Spec.=86.3%.
Domain*: 61%

Some of these models were externally validated; all were cross validated. LGO (Leave Groups Out) and LOO (Leave One Out) refer to statistical validations ("cross validations"), where one or more (e.g. 10%, 20% or 50%) of the substances are removed at random and a new model is developed on the remaining substances. The new model is then used to predict the chemicals which were not used to make it. This is repeated a number of times. Predictions are compared with experimental test results to determine the predictivity of the model. (LOO, as used for the TOPKAT models, may give too "optimistic" results in some cases.) For continuous variables (quantitative predictions), predictivity is expressed as the regression coefficient for predictivity, q2. For parametric ("yes/no") models, Cooper statistics are used to express predictivity as sensitivity (sens.), specificity (spec.) and concordance (conc.).
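The leave-groups-out procedure described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-descriptor least-squares line stands in for the actual Multicase or TOPKAT models, the names `fit_line` and `lgo_q2` are hypothetical, the groups are taken as disjoint random subsets so that each substance is predicted exactly once, and q2 is computed with the commonly used definition q2 = 1 - PRESS / SS_total, which the source does not spell out.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def lgo_q2(xs, ys, fraction=0.1, seed=0):
    """Leave-groups-out cross-validation; returns q2 = 1 - PRESS / SS_total.

    fraction is the share of substances removed per round
    (fraction = 1/len(xs) gives leave-one-out).
    """
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)          # remove substances at random
    group_size = max(1, round(fraction * len(indices)))
    press = 0.0                                   # predictive residual sum of squares
    for start in range(0, len(indices), group_size):
        held_out = set(indices[start:start + group_size])
        train = [i for i in indices if i not in held_out]
        # Build a new model on the remaining substances only.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Predict the substances that were not used to make the model.
        press += sum((ys[i] - (a * xs[i] + b)) ** 2 for i in held_out)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Synthetic data (not from the source): a linear trend plus a small perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.1 * ((7 * i) % 5 - 2) for i, x in enumerate(xs)]
print("LGO (10%) q2:", lgo_q2(xs, ys, fraction=0.1))
print("LGO (50%) q2:", lgo_q2(xs, ys, fraction=0.5))
```

Removing a larger fraction per round leaves less data for each rebuilt model, which is one reason LGO (50%) is usually a harder test than LGO (10%) or LOO.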
The validation column does not show internal performance measures. It shows predictivity assessed by external validation and cross validation, in which chemicals are removed entirely and new models are built to predict the chemicals that were not used to make them. Sensitivity is the fraction of the positives correctly identified by the model, specificity is the fraction of the negatives correctly identified by the model, and concordance is the overall accuracy.

Regarding commercial models: for TOPKAT models see http://www.accelrys.com/products/topkat/modules.html and for Multicase models see http://www.multicase.com/. Please note that models from Episuite and the EU TGD are not referenced above, but cf. http:/www.epi.gov and http://ecb.jrc.it/

* Domain for approximately 46000 EINECS substances. For further details on the domain description, see www.mst.dk/chemi/01050000.htm.

** cf. also: ENV/JM/TG(2004)27/ANN p. 91-110, Annex 4, J. Niemelä & E. Wedebye: "A 'Global' Multicase Model for in vitro Chromosomal Aberrations in Mammalian Cells"
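As a worked check of the Cooper statistics defined above, the short Python sketch below recomputes sensitivity, specificity and concordance from the counts quoted for the external validation of the four proprietary cancer models (34 of 58 positives and 41 of 42 negatives correctly predicted). The function name `cooper_statistics` and its argument names are hypothetical, chosen only for this illustration.

```python
def cooper_statistics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, concordance) as fractions.

    tp/fn: positives predicted correctly/incorrectly
    tn/fp: negatives predicted correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)                   # fraction of positives identified
    specificity = tn / (tn + fp)                   # fraction of negatives identified
    concordance = (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    return sensitivity, specificity, concordance


# Counts taken from the table: Sens.=34/58, Spec.=41/42, Conc.=75/100.
sens, spec, conc = cooper_statistics(tp=34, fn=58 - 34, tn=41, fp=42 - 41)
print(f"Sens.={sens:.1%} Spec.={spec:.1%} Conc.={conc:.1%}")
# prints: Sens.=58.6% Spec.=97.6% Conc.=75.0%
```

Concordance is a weighted average of sensitivity and specificity, so it always lies between the two; a high concordance can therefore hide a low sensitivity when negatives dominate the test set.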