Developing and Disseminating Hazard and Exposure Data: An Integrated Tiered Testing and Assessment Approach for a Screening Level Safety Assessment

A. Background

This guidance document has been developed to assist ACC members in developing safety evaluations for high priority chemicals as they implement ACC’s Product Safety Code. The guidance is not structured as a step-by-step “how to” manual, but rather as a general framework that can be used to develop screening-level hazard information and screening-level exposure information, and to integrate that information to derive a screening-level safety assessment. Companies should tailor their assessments to reflect relevant chemical properties, hazards and uses, and are free to follow alternative approaches for developing safety evaluations to meet the objectives of the new ACC Product Safety Code.

B. Introduction

Developments resulting from the U.S. High Production Volume (HPV) Challenge Program, REACH initiatives in Europe, and recent TSCA modernization discussions in the U.S. have prompted industry to:

1) Develop improved methods to evaluate the potential hazards and risks of chemicals; and
2) Provide chemical hazard, use, exposure and risk information to regulators and stakeholders, including the public.

Hazard, use and exposure information is already being collected and provided to stakeholders by ACC members through the HPV/EHPV program and International Council of Chemical Associations (ICCA) Global Product Strategy (GPS) product safety summary documents. Consistent with ACC’s Board-approved advocacy for TSCA Modernization, additional enhancements to the U.S. chemical management program are envisioned. The approach outlined in this guidance builds on those Board-approved positions to provide a framework that each member company may consider using to meet its Responsible Care (RC) commitments to develop safety evaluations on its chemicals.

C. Integrated Testing and Assessment is Superior to an Inflexible “Base Set”

The concept of a one-size-fits-all “base set” of traditional toxicity tests for all substances is outdated. The recent evolution of toxicity testing methods toward the inclusion of in vitro and in silico approaches for hazard identification or screening has been based, in part, on the desire to reduce the use of animals in regulatory program-driven toxicity testing. These integrated testing strategies, which utilize relevant information from multiple sources, including predictive computer models, chemical categories, and in vitro assays, are gaining greater acceptance as a way to provide estimates of toxicological properties in lieu of conducting in vivo animal toxicity tests. Integrated testing strategies have been incorporated into the U.S. HPV Challenge Program, the Organisation for Economic Co-operation and Development’s (OECD) HPV Programme, the EU REACH regulatory framework, and the ICCA’s GPS.

Confidence in the validity of these hazard profiling methods is key to their application, and this means there must be adequate evidence for the relevance, reliability, sensitivity, and specificity of the methods. It is only with such evidence that regulatory agencies, the regulated community, and the public can be confident that the use of such data for decision making is protective of human health and the environment.

Although various programs differ somewhat in how integrated testing is implemented, all share a common feature: an initial set of information on potential hazards and exposures is developed based on existing data, which usually includes data from a few general toxicity tests, physical chemical properties, an understanding of uses, and science-based inference based on structure, class, functional groups, etc.
Then, only if there is a legitimate need for greater confidence, or this initial set of information signals a concern, would additional specific toxicity tests (or exposure evaluations) be conducted. Signals for concern could include consideration of uses and exposures (e.g., widespread dispersive use or a high production volume chemical) or a functional group indicative of high toxicity. With this approach, resources are focused on the specific complex tests that are most important for hazard characterization, while minimizing the use of animals. By using hazard and exposure profiling tools and models, and arraying toxicity tests in a tiered framework, appropriate triggers can be used to assist in determining when (and what specific) additional testing or evaluation is needed. This integrated testing and assessment approach provides the necessary degree of scientific rigor for product stewardship, while preserving flexibility to account for differing chemical toxicities and to address specific concerns associated with existing and anticipated exposures to specific chemicals. It should be noted that U.S. regulatory agencies are moving toward integrated testing approaches in almost all areas where human health and environmental effects are being evaluated.

D. Developing a Screening Level Safety Evaluation

The first step for a company would be to initiate a prioritization process in order to categorize the substances in its product line into high, medium, and low concern substances. ACC has developed such a prioritization tool, and it represents one method that can be employed by individual companies.[1] Once prioritized, the next step would be to employ a tiered approach, such as the ICCA Global Product Strategy: ICCA Guidance on Chemical Risk Assessment (ICCA-GRA) framework[2] or that described by Plunkett et al., 2010[3], for gathering and evaluating available substance-related toxicity, use and exposure data.

[1] http://www.americanchemistry.com/Prioritization-Document. August 29, 2011.
[2] ICCA “Global Product Strategy: ICCA Guidance on Chemical Risk Assessment” 2nd Edition, xxxx, 2011
[3] Regulatory Toxicology and Pharmacology 58:382-394.

For each ICCA-GRA priority level, the ICCA-GRA framework provides guidance on assembling the initial set of information needed for conducting a screening level safety assessment. Chemicals with higher hazard and/or exposure potential require more information at the outset as a starting point for their safety assessment than do chemicals with lower hazard or exposure potentials.

The typical output of the screening level safety assessment for human health is a Margin of Exposure (MOE), which is the ratio of the Human Toxicity hazard endpoint, or Point of Departure (POD)[4], to the estimated level of exposure. A Benchmark MOE is typically established by taking into account factors such as extrapolation from lab animal toxicity studies to humans, extrapolation from the average human to a sensitive human, and the quality and completeness of the overall toxicity data set. Typically, if the Human POD is based on straight extrapolation from an animal toxicity study NOAEL, the Benchmark MOE is set at 100. However, if allometric scaling[5] based on $BW^{3/4}$ is used to derive the Human POD, a Benchmark MOE of 30 is common.[6]

The Human Toxicity POD is typically derived from a No Observed Adverse Effect Level (NOAEL) or an equivalent metric, such as a Benchmark Dose (BMD) or Benchmark Dose Lower Limit (BMDL), based upon consideration of the toxicity information for all the relevant endpoints and tests. The Human Toxicity POD may also include an assessment of the uncertainty associated with the available toxicity information. The input of an experienced toxicologist or risk assessor should be considered when evaluating the toxicity studies for determining NOAELs, BMDs, PODs and Benchmark MOEs.
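The relationship between the two Benchmark MOE conventions above can be illustrated with a minimal Python sketch. It assumes the usual form of body-weight scaling for doses expressed in mg/kg-day, in which the animal dose is multiplied by (animal body weight / human body weight) raised to the 1/4 power. The function names, the 70 kg human body weight, and the 0.25 kg rat body weight are illustrative assumptions, not values prescribed by this guidance; an experienced toxicologist should confirm the inputs for any real assessment.

```python
def human_equivalent_dose(animal_dose_mg_per_kg_day, animal_bw_kg, human_bw_kg=70.0):
    """Scale an animal dose (mg/kg-day) to a human equivalent dose.

    Assumes BW^(3/4) scaling of absolute dose, which for doses already
    normalized to body weight reduces to a factor of
    (BW_animal / BW_human) ** (1/4). Default body weight is illustrative.
    """
    if animal_bw_kg <= 0 or human_bw_kg <= 0:
        raise ValueError("body weights must be positive")
    dosimetric_adjustment = (animal_bw_kg / human_bw_kg) ** 0.25
    return animal_dose_mg_per_kg_day * dosimetric_adjustment


def benchmark_moe(pod_is_allometrically_scaled):
    """Return the typical Benchmark MOE described in the text:
    30 for an allometrically scaled Human POD, otherwise 100."""
    return 30.0 if pod_is_allometrically_scaled else 100.0


# Hypothetical example: a rat NOAEL of 100 mg/kg-day, rat body weight 0.25 kg.
scaled_pod = human_equivalent_dose(100.0, animal_bw_kg=0.25)
print(f"Scaled Human POD: {scaled_pod:.1f} mg/kg-day, "
      f"Benchmark MOE: {benchmark_moe(True):.0f}")
```

The sketch keeps the scaling step and the benchmark selection separate, since the choice of Benchmark MOE also depends on data quality and completeness, which a simple flag cannot capture.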
Different data sets, routes of exposure and study types/designs can dictate the use of different methods. For example, EPA’s Benchmark Dose Software[7] includes models to address quantal data, continuous data, and nested developmental toxicology data.

For a safety assessment, when the ratio of the Human Toxicity POD to the estimated level of exposure is sufficiently large (e.g., MOE > 100 for a NOAEL-derived Human POD or MOE > 30 for an allometrically-scaled Human POD), no additional action would generally be required. If the ratio is less than the appropriate Benchmark MOE, or if a particular effect observed in a toxicity study[8] raises concern, then additional hazard characterization studies, a refined exposure assessment, or enhanced product stewardship measures should be considered.

For some substances, there may be a need to estimate potential carcinogenic risks in addition to non-cancer risks. The approach employed will depend on an understanding of the mode of action.[9] EPA guidance (Ibid.) and experienced risk assessment practitioners should be consulted for determining the mode of action and selecting the methodology to extrapolate from animal studies at high doses to estimate potential carcinogenic risks to humans at environmentally relevant exposures. A Margin of Exposure approach can be used for carcinogenic substances that act via a non-linear, threshold mode of action. For other carcinogenic substances, the toxicity profile and mode of action may indicate consideration of a linear low dose extrapolation method. In some cases, both the threshold and the linear approaches may be warranted to reflect the uncertainty of current understanding. All assessments should include the rationales for selecting the mode(s) of action and the use of the specific method for quantifying potential carcinogenic risks to humans at environmentally relevant exposures.

For environmental and ecotoxicological safety evaluations, it is recommended that the ICCA-GRA decision-tree approach to developing hazard, use and exposure information be considered. However, additional discussion by companies on this approach is probably warranted to assess how effectively it has worked in practice.

[4] The POD is often the lower bound on dose for a change in response level from a dose-response model (BMD), or a NOAEL or LOAEL for a change in level of response.
[5] EPA (2011). Final recommended use of body weight$^{3/4}$ as the default method in derivation of the oral reference dose. US Environmental Protection Agency. http://www.epa.gov/raf/publications/pdfs/recommended-use-ofbw34.pdf.
[6] EPA (2012). TSCA Workplan Chemical Risk Assessment N-Methylpyrrolidone: Paint Stripping Use. http://www.epa.gov/oppt/existingchemicals/pubs/TSCA_Workplan_Chemical_Risk_Assessment_of_NMP.pdf
[7] Benchmark Dose Software (accessed January 31, 2012). http://www.epa.gov/ncea/bmds/index.html
[8] Ibid. In conducting such evaluations, consideration of the biologically-based endpoint-specific toxicity triggers described in Plunkett et al., 2010 (Regulatory Toxicology and Pharmacology 58:382-394) may be helpful.
[9] EPA (2005). Guidelines for Carcinogen Risk Assessment. Washington, DC: Risk Assessment Forum, U.S. Environmental Protection Agency; March 2005, “The term ‘mode of action’ is defined as a sequence of key events and processes, starting with interaction of an agent with a cell, proceeding through operational and anatomical changes, and resulting in cancer formation. A ‘key event’ is an empirically observable precursor step that is itself a necessary element of the mode of action or is a biologically based marker for such an element. Mode of action is contrasted with ‘mechanism of action,’ which implies a more detailed understanding and description of events, often at the molecular level, than is meant by mode of action. The toxicokinetic processes that lead to formation or distribution of the active agent to the target tissue are considered in estimating dose but are not part of the mode of action as the term is used here. There are many examples of possible modes of carcinogenic action, such as mutagenicity, mitogenesis, inhibition of cell death, cytotoxicity with reparative cell proliferation, and immune suppression.”

1. Detailed Discussion on “Base Set”

To sufficiently characterize hazards and risks of chemical substances along the value chain, it is necessary to have an appropriate set of information on both potential hazards and anticipated exposures. Recognizing this, ICCA-GRA uses a tiered, risk-based approach for evaluating potential risks to human health as well as the environmental safety of chemicals and chemical products. Under the GPS, member companies, through the ICCA, have committed to gathering safety data for chemicals in commerce, preparing risk assessments and sharing information on the risks and benefits of those chemicals with the public in an understandable form.

To implement the GPS, the ICCA developed the ICCA-GRA, which includes a decision tree framework, called the “Base Set of Information,” to define the initial information commitment that is considered sufficient for conducting risk assessment for most chemicals in commerce. The current ICCA-GRA framework defines four risk tiers (minimal, low, medium, and high potential), with each tier requiring increasing amounts of information related to the toxicology and ecotoxicology of the substance, based on the potential for human and/or ecological exposure and/or the hazard potential of a substance.

There is some confusion around the term “base set” as used in the ICCA-GRA framework. Indeed, the term “base set” is a misnomer. Rather than viewing the ICCA-GRA as requiring a single, inflexible, one-size-fits-all “base set” of information, it is more appropriate to think of ICCA-GRA as a guided decision-tree process to assist in selecting the initial set of hazard information: the set of data and information commensurate with the substance’s uses and exposure potential.

The ICCA-GRA “base set” decision-tree process begins with the evaluation of all relevant existing information on hazard, use, and exposure of a substance. The ICCA-GRA guidance provides a decision tree process to assist in determining whether existing information is sufficient as an initial set of hazard data/information or whether additional toxicity testing or toxicity information gathering is warranted in order to provide confidence in product stewardship decisions to protect workers, the public, and the environment.

The ICCA-GRA decision-tree procedure is risk-based: both hazard and potential exposure are integral components of the process. These two components, which guide decisions on the initial set of information required for product stewardship, assure consistency across chemicals, companies and regions. Depending on the degree of toxicity and the degree of exposure, the “base set” of information that is considered sufficient for conducting risk assessment may vary. In cases where hazards are greater and exposures are more wide-ranging, a more extensive “base set” of information will be needed. The ICCA-GRA decision-tree framework provides guidance for determining these health, safety, and environment information needs.

The ICCA-GRA framework also provides guidance on conducting quantitative risk assessment. It includes an exposure section, which provides assistance for developing and using exposure scenarios and models to derive quantitative estimates of exposure. The ICCA-GRA framework calls for integrating such exposure estimates with NOAELs derived from toxicity studies to obtain a MOE.
While the ICCA-GRA framework is scientifically sound in all respects, each company should carefully review the parameters and defaults, as many reflect guidance by the European Chemicals Agency (ECHA) developed for Europe and REACH, and these may not be appropriate or acceptable for use in other regions.

2. Screening Level Safety Evaluation of “High Priority” Substances

For “high priority” substances, the initial set of hazard and exposure information for integrated testing and evaluation for human health is similar to that of the OECD Screening Information Data Set (SIDS). The ICCA-GRA specifies that data/information on the following endpoints/toxicities be evaluated: irritation (eye/skin); mutagenicity (e.g., Ames and mammalian cell in vitro tests, with an in vivo micronucleus test only if both in vitro tests are positive); sensitization; repeated dose toxicity; and reproductive/developmental toxicity. These encompass the ACC Board-approved information elements in the HPV Challenge Program. Additionally, while the HPV Challenge Program did not require irritation data, in many cases these data were included because they were readily available. The ICCA-GRA also recommends evaluating sensitization, even though this was not included in the HPV Challenge Program endpoints. Further, companies should also consider toxicokinetics (e.g., ADME screening assessment) when evaluating “High Priority” substances (for discussion and approaches see Plunkett et al., 2010).

The extent of this initial set of data/information is believed to be sufficient to support the screening level safety evaluation of high priority chemicals. It builds from the experience of ACC companies with the U.S. HPV Challenge Program. The types of studies and endpoints examined and reported under the HPV Challenge Program are directly relevant to evaluating human health risks, including risks to children’s health and development.
These studies provide valuable information for the following:

1) Identification and definition of possible hazards on all major organ systems from both acute and repeated exposures;
2) Detection of potential hazards arising from in utero exposures;
3) Evaluation of the potential of a substance to affect reproduction;
4) Evaluation of the potential of a substance to damage DNA; and
5) Establishment of NOAELs (the highest exposure levels at which there are no biologically significant increases in the frequency or severity of adverse effects).

From this information on hazard, a NOAEL or BMD can be determined. In the absence of human data, results from the animal model that is most relevant to humans are selected. In the absence of information to the contrary, data associated with the most sensitive species (the species showing a toxic effect at the lowest administered dose) are generally used in a screening-level assessment. From the suite of studies, the “critical endpoint” is the effect exhibiting the lowest NOAEL or BMD. The NOAEL or BMD is used to derive the Human Toxicity POD, which in turn is used for evaluating safety using the MOE technique.

Scientific procedures, such as those employed by the U.S. EPA[10] or ECHA[11] to evaluate studies to determine the quality, reliability, and adequacy of scientific information, should be used. ACC’s Center for Advancing Risk Assessment and Science Policy (ARASP) published a review of existing methods for making such data quality determinations for in vivo and in vitro studies.[12] In addition, where multiple studies exist, particularly if results across studies differ significantly, companies will need to integrate such results using a scientific process for determining the overall weight-of-the-evidence for a particular metric, effect, or outcome. Again, both the U.S. EPA[13] and ECHA[14] have developed specific technical guidance on weight of evidence determinations.[15] A comprehensive review of published approaches for integrating multiple lines of evidence from different studies to arrive at an overall weight of the evidence determination has recently been completed by ACC’s ARASP.[16]

[10] http://www.epa.gov/hpv/pubs/general/datadfin.htm.
[11] ECHA REACH Guidance on information requirements and chemical safety assessment; http://guidance.echa.europa.eu/docs/guidance_document/information_requirements_en.htm?time=1259066690. Volume 3: Chapter R.4 Evaluation of available information.
[12] ACC (2012). Toxicity Data Evaluation (Method Validity, Data Quality, Study Reliability) For Hazard and Risk Assessments: Best Practices
[13] See for example EPA’s Guidelines for Carcinogen Risk Assessment; http://www.epa.gov/cancerguidelines/index.htm.
[14] ECHA. How to report weight of evidence; http://echa.europa.eu/documents/10162/13655/pg_report_weight_of_evidence_en.pdf
[15] ACC is not endorsing either the EPA or ECHA guidance. These citations are provided solely for the purpose of documenting that a weight of evidence evaluation is a scientific procedure. In fact, the National Academy of Sciences has been very critical of EPA’s approach to employing weight of evidence determinations, for example the NAS Review of the Environmental Protection Agency's Draft IRIS Assessment of Formaldehyde (2011) at http://www.nap.edu/catalog.php?record_id=13142#toc.

For estimating exposures, when conducting a screening-level evaluation, conservative screening exposure prediction models are typically used for each given scenario. The ECETOC TRA[17] tool and E-FAST[18] are examples of exposure modeling approaches that could be considered. These screening-level models are deterministic tools that provide a health-protective, bounding estimate of exposure using relatively little input information.
This approach results in high-end exposure estimates that often overestimate exposure. Such models are useful for screening level assessments precisely because their estimates are very unlikely to underestimate realistic exposure.

Based on the defined generic exposure (workplace and consumer) scenarios, exposure models should be used to derive estimates of exposure for the relevant scenarios. By examining which types of products have the greatest overall potential for worker and/or consumer exposures, the exposure models provide quantitative estimates of chemical exposures that correspond to specific uses of specific product types and are unlikely to underestimate exposure. (Aggregate exposures can be calculated where uses may indicate simultaneous or overlapping exposures from multiple sources.)

While the ICCA GPS framework is scientifically sound, each company should carefully review the parameters and defaults, as many reflect guidance by ECHA developed for Europe and REACH, and these may not be appropriate or accepted for use in other regions.

For evaluating safety, the MOE is calculated as the ratio of the Human Toxicity POD to the Screening Level Exposure for each scenario:

\[ \mathrm{MOE} = \frac{\text{Human Toxicity POD}}{\text{Screening Level Exposure}} \]

The higher the MOE, the lower the concern. Typically, depending upon the available dataset for a given substance, for a screening level safety assessment, a MOE greater than or equal to 100 is considered health protective. The MOE is not a bright line between “safe” and “unsafe.” Interpretation of results requires experienced risk assessors and product stewards. The derived MOEs are inherently conservative in nature for the screening level safety assessment by virtue of the fact that the top end of the predicted exposure (e.g., equivalent to the 95th percentile of likely exposures for that scenario) is used as the denominator.
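The MOE calculation above can be applied scenario by scenario, as in the following minimal Python sketch. The scenario names, the POD, and the exposure values are hypothetical numbers chosen only to make the arithmetic easy to follow, and the default benchmark of 100 is the NOAEL-based convention described in the text. The aggregate case assumes that overlapping exposures are simply summed, which is one possible approach and not a method specified by this guidance.

```python
def margin_of_exposure(pod, exposure):
    """MOE = Human Toxicity POD / Screening Level Exposure (same units)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return pod / exposure


def screen_scenarios(pod, exposures, benchmark=100.0):
    """Return {scenario: (MOE, meets_benchmark)} for each exposure scenario."""
    results = {}
    for scenario, exposure in exposures.items():
        moe = margin_of_exposure(pod, exposure)
        results[scenario] = (moe, moe >= benchmark)
    return results


# Hypothetical inputs, all in mg/kg-day.
pod = 50.0
high_end_exposures = {
    "worker_handling": 0.5,   # high-end (e.g., 95th percentile) model estimate
    "consumer_use": 2.0,
}

for scenario, (moe, ok) in screen_scenarios(pod, high_end_exposures).items():
    status = "meets benchmark" if ok else "refinement may be warranted"
    print(f"{scenario}: MOE = {moe:.0f} ({status})")

# Assumed aggregate case: simultaneous exposures are summed before dividing.
aggregate_moe = margin_of_exposure(pod, sum(high_end_exposures.values()))
print(f"aggregate: MOE = {aggregate_moe:.0f}")
```

Because the result is not a bright line, the boolean flag here is best read as a prompt for review by an experienced risk assessor, not as a final determination.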
If the MOE is sufficiently large (e.g., >100), no additional action would generally be required. In general, if the MOE is less than 100 in the screening-level assessment, or if a particular effect observed in a toxicity study raises concern, then additional assessment may be required to provide more certainty in the safety assessment. This could entail additional work in hazard characterization, or refinement of the exposure assessment using models or approaches that more accurately estimate true exposures. Companies may also elect to evaluate additional product stewardship actions where the MOE indicates such considerations may be warranted.

[16] Rhomberg et al., 2013. Best Practices for Weight-of-Evidence Evaluations
[17] http://www.ecetoc.org/tra
[18] http://www.epa.gov/oppt/exposure/pubs/efast.htm

For guidance in characterizing potential risks and for communicating results to the value chain and the public, companies should refer to ACC’s GPS guidance, the ICCA-GRA, or to guidance developed by EPA[19] or ECHA.[20]

[19] http://www.epa.gov/spc/pdfs/rchandbk.pdf
[20] http://echa.europa.eu/documents/10162/13632/information_requirements_part_e_en.pdf

3. Evaluation of “Medium and Low Priority” Substances

For medium and low priority substances, the approaches described in the ICCA-GRA framework are recommended. The initial set of information evaluated for such substances is less than that for high priority chemicals. Yet the approach for determining safety is essentially the same for high, medium and low priority chemicals: derivation of the MOE. In addition, if ICCA-GRA-based hazard triggers for a specific toxicity test or endpoint are exceeded for substances ranked as medium or low priority, then the GPS decision tree would be used to guide development of additional data/information. In all cases, such “information gaps” can be filled by use of surrogate data obtained from reliable approaches, such as read-across, QSAR, DfE models, etc.

Similar to the U.S. HPV program and other integrated testing and evaluation approaches, under the ICCA-GRA, toxicity data can be obtained from hazard profiling models (SAR, read-across, QSAR, etc.), with lab animal tests as the “last resort,” reserved until all the existing data have been evaluated. When lab animal studies are needed, test guideline studies using validated methods should be conducted or identified to assure the highest quality of data and mutual acceptance of results. Again, while the ICCA-GRA framework is scientifically sound, each company should carefully review the parameters and defaults, as many reflect guidance by ECHA developed for Europe and REACH, and these may not be appropriate or accepted for use in other regions.

4. Graphical Depiction of the Steps in Conducting a Screening Level Safety Evaluation

The following graphics illustrate the steps involved in conducting a screening-level safety assessment. The language used in each graphic is for illustrative purposes and is not meant to be a “cookbook.” Each assessment should be tailored, as appropriate, based upon the properties of the chemical, its uses and exposures. In addition, as indicated above, where multiple studies exist, particularly if results across studies differ significantly, companies will need to integrate such results using a scientific process for determining the overall weight-of-the-evidence; such a weight of the evidence procedure is not included in these graphics.

4.1. Overview of the Steps Involved in Conducting a Screening Level (Tier 1) Safety Evaluation

4.2 Steps Involved in Conducting the Hazard Evaluation Components of the Screening Level (Tier 1) Safety Evaluation

4.2.1 Collecting and Evaluating Toxicity Data

4.2.2 Identifying Effects and Dose Response

4.2.3 Determining the Key Study and Critical Endpoint

4.2.4 Deriving the Human Toxicity POD Value

4.3 Developing the Exposure Assessment

4.3.1 Identification of Product Uses

4.3.2 Identification of Exposure Scenarios for Intended Uses

4.3.3 Selection of Exposure Model(s)

4.3.4 Derive Screening Level Estimates of Human Exposures

4.4. Combining the Hazard Evaluation with the Exposure Evaluation to Determine the Margin of Exposure (MOE) for the Screening Level (Tier 1) Safety Evaluation

4.4.1 Deriving the Margin of Exposure

5. Evaluating the Screening Level (Tier 1) Margin of Exposure to Determine “Safe for Intended Use” or “Refinement of the Hazard Evaluation or Exposure Assessment May Be Warranted”
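As a rough illustration of how steps 4.2.3 through 5 fit together, the following Python sketch selects a key study by taking the lowest NOAEL across a set of studies, treats that NOAEL directly as the Human Toxicity POD (the straight-extrapolation case, with a Benchmark MOE of 100), and returns one of the two outcomes named in step 5. The `Study` class, the function names, and all study values are hypothetical; a real assessment would also involve data quality review, weight of evidence, and expert judgment, none of which are modeled here.

```python
from dataclasses import dataclass


@dataclass
class Study:
    endpoint: str
    species: str
    noael_mg_per_kg_day: float


def critical_study(studies):
    """Step 4.2.3: pick the study with the lowest NOAEL (critical endpoint)."""
    if not studies:
        raise ValueError("at least one study is required")
    return min(studies, key=lambda s: s.noael_mg_per_kg_day)


def tier1_outcome(studies, exposure_mg_per_kg_day, benchmark_moe=100.0):
    """Steps 4.2.4 to 5: derive the POD, compute the MOE, and classify."""
    key = critical_study(studies)
    pod = key.noael_mg_per_kg_day  # assumption: NOAEL used directly as POD
    moe = pod / exposure_mg_per_kg_day
    if moe >= benchmark_moe:
        outcome = "Safe for Intended Use"
    else:
        outcome = ("Refinement of the Hazard Evaluation or "
                   "Exposure Assessment May Be Warranted")
    return key, moe, outcome


# Hypothetical data set for a single substance.
studies = [
    Study("repeated dose toxicity", "rat", 150.0),
    Study("developmental toxicity", "rabbit", 40.0),
    Study("reproductive toxicity", "rat", 100.0),
]
key, moe, outcome = tier1_outcome(studies, exposure_mg_per_kg_day=0.25)
print(f"Critical endpoint: {key.endpoint} ({key.species}), "
      f"MOE = {moe:.0f}: {outcome}")
```

Keeping the key-study selection separate from the MOE comparison mirrors the split between the hazard evaluation (4.2) and the combined evaluation (4.4 and 5), so either part can be refined without changing the other.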