Verification of Performance Specifications
An Advanced View of Method Validation
Version 5.0, August 2012

This project has been funded in whole or in part with Federal funds from the Division of AIDS (DAIDS), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract No. HHSN272201200009C, entitled NIAID HIV and Other Infectious Diseases Clinical Research Support Services (CRSS).

Objectives
• Identify test classifications
• Define what each validation experiment details for testing methods
• Discuss what is recommended to perform each of the validation experiments for testing methods
• Recognize how to evaluate data obtained from each of the validation experiments

Pre-Assessment Question #1
A rapid Human Immunodeficiency Virus (HIV) test would likely be classified as a:
A. High complexity, modified assay
B. Moderate complexity, unmodified assay
C. Food and Drug Administration (FDA)-approved, modified assay
D. Waived, FDA-approved, unmodified assay

Pre-Assessment Question #2
The precision of a test method gives information related to the method’s:
A. Systematic error
B. Comparison of results to a reference method
C. Reproducibility
D. Likelihood of being affected by hemolysis, lipemia and icterus
E. Both A and B

Pre-Assessment Question #3
When transferring reference intervals using 20 specimens, what is the minimum number that must fall within the manufacturer’s reference intervals?
A. 20
B. 18
C. 16
D. 15

Pre-Assessment Question #4
Which linear regression equation component gives information regarding constant bias?
A. y
B. x
C. m (slope)
D. b (intercept)

Selecting a Method
• Evaluate diagnostic tests
• Characteristics of testing methods
• References: technical literature and manufacturer’s information
• Select method of analysis
• Validate method performance
• Implement method
• Perform tests with appropriate Quality Control (QC) and External Quality Assurance (EQA)

Method Validation
• What is method validation?
• Why must we validate?
• When should we validate?
• What should we validate?

Method Validation (cont’d)
Why is validation important?
• Division of Acquired Immunodeficiency Syndrome (DAIDS) requirement
• How important is it that the results produced by the testing method are reliable?
• Shouldn’t the laboratory know the level of performance of an adopted test method?

Tests to Validate
• Waived
• Non-waived
  • Unmodified FDA-approved
  • Modified and/or Non-FDA-approved

FDA Approval Resources
• Vendor
• Publications
• http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/LabTest/ucm126079.htm

Skill Check
What would you consider to be the complexity, per Clinical Laboratory Improvement Amendments (CLIA), of the glucose assay in the workbook?
A. Waived
B. Moderate
C. High

Skill Check
What would you consider to be the complexity of a rapid urine pregnancy assay?
A. Waived
B. Moderate
C. High

Skill Check
What would you consider to be the complexity of performing a manual white cell differential using a stained whole blood smear?
A. Waived
B. Moderate
C. High

Method Validation
Before you begin:
• Be sure you are familiar with the test method before starting
• Know what to expect from the method (package insert, discussions with technical assistance, and field service representatives)
• Do not include results outside of stated reportable ranges
• Predict your findings; establish limits/evaluation criteria

Terms for Discussion
• Central tendency
• Dispersion

Terms for Discussion (cont’d)
[Figure: values (Y-axis) plotted by run (X-axis)]

Error in Test Methods
• Some error is expected
  • Examples
• Error must be managed
  • Understanding
  • Defining specifications of allowable error
  • Measurement
• Total error of testing system
  • CLIA guidelines per analyte
  • Other guidelines

[Figure: systematic error, random error and total error]

Error Assessment
• Systematic Error (SE): in one direction; causes results to be high or low
• Random Error (RE): in either direction; unpredictable
• Total Error (TE): combined effect

Total Error Considerations
• Low-end performance standards: recommendations derived from the upper portion of the reportable range are more difficult to achieve at lower concentrations
• Maximum total error allowed: considered to be 30% by David Rhoads, except for amplification methods

Systematic and Random Errors
Systematic error is described by:
• Slope/Proportional error
• Intercept/Constant error
• Bias
Random error is described by:
• Mean
• Standard deviation (SD)
• Coefficient of variation (CV)

Tools for Use
Data-crunching tools:
• Statistical calculators, graph paper
• Spreadsheets with calculations
• Validation software (Westgard, AnalyzeIt, EP Evaluator)

How We Will Work Through This Module
• One quantitative test taken through the validation process
• One qualitative method taken through the validation process

Elements of Validation
• Reportable Range
• Precision
• Accuracy
• Reference Intervals
• Sensitivity
• Specificity

Precision
Introduction
• Definition: Reproducibility
• Gives information related to random error
What is needed
• 20 samples of the same material (typically two levels; e.g., glucose at 50 and 300 mg/dL)
• Standard solutions
• Control materials
• Pools (short term only)
How we perform the testing
• Repeat testing over short and long term (one day and 20 days, respectively)

Precision: How We Evaluate the Data
Calculate the following:
• Mean
• Standard deviation (SD)
• Coefficient of variation (CV)
What amount of random error is allowable, based on CLIA criteria?
• Short term: 0.25 of allowable total error
• Long term: 0.33 of allowable total error

Allowable Total Error Database
Link for:
• Clinical Laboratory Improvement Amendments (CLIA)
• College of American Pathologists (CAP)
• Royal College of Pathologists of Australasia (RCPA)
• Others
http://www.dgrhoads.com/db2004/ae2004.php

Precision: Levey-Jennings (LJ) Charts
[Figure: Levey-Jennings chart of values (Y-axis) by run (X-axis)]

Precision: How We Evaluate the Data
How do we compare to manufacturer’s data?
• Mean
• SD
• CV: more commonly used, allows for easier comparison

Precision Example
• Mean of Level 1 glucose: 90 mg/dL
• CLIA total allowable error for glucose: ± 6 mg/dL or ± 10% (whichever is greater)
• Total allowable error, Level 1 glucose: 0.1 x 90 = 9 mg/dL (greater than 6 mg/dL, so the 10% limit applies)
• Random error allowed, short-term precision: 0.25 x total allowable = 0.25 x 9 mg/dL = 2.25 mg/dL
• Random error allowed, long-term precision: 0.33 x total allowable = 0.33 x 9 mg/dL = 2.97 mg/dL

Activity
• Work with Levey-Jennings graph and data
• Use a mean and standard deviation to calculate a coefficient of variation, and use a mean and coefficient of variation to calculate a standard deviation
• Determine if precision data are acceptable

Accuracy
Introduction
• Definition: How close to the true value
• Comparison of methods
• Gives information related to systematic error
• Potential conflicts on interpretation of results (reference values)
What is needed
• 40 different specimens
• Cover reportable range of method
• Quality versus quantity
How we perform the testing
• Duplicate measurements of each specimen on each method
• Minimum of five days; 20 days preferred (the same period as the long-term replicate testing)

Accuracy: How We Evaluate the Data
Graph the data in real time:
• Difference plot
• Comparison plot
Calculate y = mx + b:
• Test method on Y-axis; reference (comparative) method on X-axis
• b represents constant error
• m represents proportional error
• Shows analytical range of data, linearity of response over range and relationship between methods

Visual Inspection for Accuracy
[Figure: comparison plot of test method (Y-axis) versus reference method (X-axis) with a best-fit line through points (x1, y1) and (x2, y2); slope = (y2 - y1) / (x2 - x1); the intercept is where the line crosses the Y-axis]

Accuracy: How We Evaluate the Data
• Slope: usually not significantly different from 1
• Intercept: not significantly different from 0
• Look for significant differences at medical decision points

Calculate Appropriate Statistics
Slope
• Measure of proportional bias
• m = (y2 - y1) / (x2 - x1), or “rise/run”
• A slope greater than 1 means the Y (Test) values are generally higher than the X (Comparative) values
• A slope of 1.11 (with an intercept near 0) means the Y (Test) values are on average 11% higher than the X (Comparative) values

Calculate Appropriate Statistics (cont’d)
Intercept of the Line
• Measure of constant bias between two methods
• Y (Test) value at the point where the line crosses the Y-axis
• If the Y intercept is 12 and the slope is 1, every Y (Test) value is 12 units higher than the corresponding X (Comparative) value

Accuracy
What type of bias do you see?
[Figure: method comparison plots]

Accuracy (cont’d)
• Constant bias
• Proportional bias

Skill Check
Can a linear regression formula offer predictive value in relation to method comparisons?
A. Yes
B. No

Activity
• Create graph based on sample set
• Determine slope from best-fit line
• Determine Y-intercept from best-fit line
• Explain the relationship between comparative and test results

Reportable Range / Linearity
Introduction
• Definition: Lowest and highest test results that are reliable
• Especially important with two-point calibrations
• Analytical Measurement Range (AMR) and derived Clinical Reportable Range (CRR)
What is needed
• Series of samples of known concentrations (e.g., standard solutions, EQA linearity sets)
• Series of known dilutions of highly elevated specimen or spiked specimens; EQA specimens
• At least four levels (five preferred)
How we perform the testing
• CLSI recommends four measurements of each specimen; three are sufficient

Reportable Range: How We Evaluate the Data
• Plot mean of measured values on Y-axis versus known or assigned values on X-axis
• Visually inspect, draw best-fit line, estimate reportable range
• Compare with expected values (typically provided by manufacturer)

Reportable Range Activity
Experimental results:

Assigned Value   Average   Rep #1   Rep #2   Rep #3   Rep #4
10.0             ____      11.0     10.0     11.0     10.0
100.0            ____      99.0     103.0    103.0    101.0
300.0            ____      303.0    305.0    304.0    306.0
500.0            ____      505.0    506.0    505.0    506.0
800.0            ____      740.0    741.0    744.0    742.0

Reportable Range Activity (cont’d)
Experimental results:

Assigned Value   Average   Rep #1   Rep #2   Rep #3   Rep #4
10.0             10.5      11.0     10.0     11.0     10.0
100.0            101.5     99.0     103.0    103.0    101.0
300.0            304.5     303.0    305.0    304.0    306.0
500.0            505.5     505.0    506.0    505.0    506.0
800.0            741.8     740.0    741.0    744.0    742.0

Reportable Range Activity (cont’d)
[Figure: Linearity Scatter Plot of recovered values (means) on the Y-axis versus assigned concentrations (units) on the X-axis]

AMR vs. CRR
• Analytical Measurement Range (AMR): linearity
• Clinically Reportable Range (CRR): allows for dilution or other preparatory steps beyond routine

Skill Check
If you do not have enough specimen to perform a dilution, upon which reportable range component must you rely?
A. AMR
B. CRR
C. Neither A nor B
D. Both A and B

Linearity Materials
Utilizing the marketing materials from the two chemistry linearity kits in your handouts:
1. Determine which kit would be more appropriate for use with the chemistry assay you chose earlier
2. Explain your reasoning

Graph Activity
Given your choice of linearity kits, you perform your AMR experiments by running four replicates of each level of known-concentration solution. The data you obtain are displayed on the next slide.
1. Review data; record any initial observations
2. Graph data on supplied graph paper
3. Determine your assay’s AMR

Linearity Experiment Results

Level   Rep 1   Rep 2   Rep 3   Rep 4
1       24      23      25      24
2       196     197     171     194
3       359     360     358     361
4       530     532     529     535
5       700     695     702     709

Activity
Using an Excel spreadsheet, create a graph and calculate linear regression statistics from the data provided.

Rep 1   Rep 2   Rep 3   Rep 4   Lab’s Average   Known Conc
24      23      25      24      24              25
196     197     171     194     195.7           200
359     360     358     361     360             375
530     532     529     535     532             550
700     695     702     709     702             725

Note: the Level 2 average of 195.7 matches the mean of 196, 197 and 194, so the 171 replicate appears to have been excluded as an outlier.

[Figure: AMR Verification, recovered concentration (Y-axis) versus known concentration (X-axis)]

Dilution Protocols
• Your medical director, in consultation with clinicians, determines that for proper study participant care the Clinically Reportable Range (CRR) for glucose is 15 to 1400 mg/dL
• Given your linearity experiment results and the package insert, devise a dilution protocol to be contained within your glucose SOP

Reportable Results
Given your AMR, CRR, and dilution protocol, how would you handle the following analyzer results?
1. 12 mg/dL
2. 800 mg/dL
3. 1600 mg/dL

Reference Intervals
Introduction
• Definition: Normal range in healthy population
• Used for diagnosis/clinical interpretation of results
• Pre-defined “normal” criteria for screening purposes
What is needed
• Transferring: 20 “normal” individuals’ specimens
• Establishing: 120 “normal” individuals’ specimens
How we perform the testing
• Perform testing on all samples
• Document results

Reference Intervals: How We Evaluate the Data
Transferring
• 18 of 20 must fall within manufacturer’s ranges
Establishing
• Calculate mean and SD of data for each group
• Reference interval = mean ± 2 SD (for a Gaussian distribution only; otherwise, additional calculations are recommended)

Activity
• Determine if assay is eligible for transference of reference intervals
• Review a sample set of data to determine if transference may be performed; if not, determine next step(s)

Sensitivity
Introduction
• Definition: Lowest reliable value; lower limit of detection, especially of interest in drug testing and tumor markers
• Different terminologies used by different manufacturers
What is needed
• Blank solutions
• Spiked samples
How we perform the testing
• 20 replicate measurements over short or long term, depending on focus

Sensitivity: How We Evaluate the Data
Three methods used:
• Lower Limit of Detection (LLD): mean of the blank sample, plus two or three SD of the blank sample
• Biological Limit of Detection: LLD plus two or three times the SD of a spiked sample with the concentration of the detection limit
• Functional Sensitivity: mean concentration for the spiked sample whose CV = 20%; the lowest limit where quantitative data are reliable

Activity
Using the manufacturer’s package inserts, find the related information for sensitivity. How was it calculated?
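The reference-interval and detection-limit rules on the slides above are simple enough to check in a few lines of code. The following is a minimal sketch, not a validated tool: the function names, the specimen values, and the 70 to 100 interval are hypothetical and chosen only for illustration, and the multiplier k (2 or 3) is left as a parameter because the slides allow either.

```python
from statistics import mean, stdev


def transference_ok(results, low, high, min_within=18):
    """Transference check: at least 18 of 20 results inside the interval."""
    n_within = sum(1 for r in results if low <= r <= high)
    return n_within >= min_within, n_within


def gaussian_reference_interval(results, k=2.0):
    """Mean +/- k SD; only appropriate when the data are roughly Gaussian."""
    m, s = mean(results), stdev(results)
    return m - k * s, m + k * s


def lower_limit_of_detection(blank_results, k=2.0):
    """LLD = mean of the blank + k SD of the blank (k = 2 or 3)."""
    return mean(blank_results) + k * stdev(blank_results)


def biological_limit_of_detection(lld, spiked_results, k=2.0):
    """Biological limit = LLD + k SD of a low spiked sample (k = 2 or 3)."""
    return lld + k * stdev(spiked_results)


# Hypothetical data, for illustration only.
normals = [72, 75, 78, 80, 81, 83, 84, 85, 86, 88,
           89, 90, 91, 92, 94, 95, 96, 98, 99, 104]
passes, n_within = transference_ok(normals, low=70, high=100)
print(f"{n_within} of {len(normals)} within interval; transfer OK: {passes}")

blank = [0.0, 1.0, 2.0, 1.0] * 5    # 20 replicate blank measurements
spiked = [4.0, 5.0, 6.0, 5.0] * 5   # 20 replicate low spiked measurements
lld = lower_limit_of_detection(blank, k=3.0)
bld = biological_limit_of_detection(lld, spiked, k=3.0)
print(f"LLD = {lld:.2f}, biological limit of detection = {bld:.2f}")
```

`gaussian_reference_interval` is shown for completeness; per the slides, establishing an interval calls for 120 specimens, so it should not be applied to the 20-specimen transference set.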
Specificity
Introduction
• Definition: Determination of how well a method measures the analyte of interest accompanied by potential interfering materials
What is needed
• Standard solutions, participant specimens or pools
• Interferer solutions (standard solutions, if possible; otherwise, pools or specimens) added at high concentrations
How we perform the testing
• Duplicate measurements

Specificity: How We Evaluate the Data
• Tabulate results for pairs of samples (dilution and interferent)
• Calculate means for each (dilution and interferent)
• Calculate the differences
• Calculate the average interference of all specimens tested at a given concentration of interference

Qualitative Assays
• Compare diagnosis
• Assume comparative (reference) method is accurate
• Determine the following:
  • True positives, true negatives
  • False positives, false negatives
• Calculate sensitivity and specificity and compare to manufacturer

Qualitative Assays: Control of Validation
Negative and positive quality controls:
• Use QC materials recommended by manufacturer for verification purposes
• Determine validity of other results, e.g., method comparisons
• Evaluate failed runs if they occur during verification process

Qualitative Methods: Precision
How is it performed?
• Runs of specimens with analyte concentrations near the cutoff point
• Three specimens: one at cutoff, one just below cutoff, and one just above cutoff (± 20% recommended)
• Replicate measurements of each of three specimens (20 each, minimum)
How is it evaluated?
• Determine percentage of positives and negatives for each specimen
• Evaluate cutoff, as well as other two specimens

Accuracy/Method Comparisons
How is it performed?
• Specimens typical of population (to be tested in future use of method)
• 50 positive specimens and 50 negative specimens recommended; minimum 20 each
• Performed over 10 to 20 days
How is it evaluated?
• Discrepant results near cutoff?
• Most often sensitivity and specificity are used to describe performance

Qualitative Methods
True vs. False

                        Comparative or Reference Method Result
Test Method Result      Positive          Negative
Positive                True Positive     False Positive     Positive Predictive Value
Negative                False Negative    True Negative      Negative Predictive Value
                        Sensitivity       Specificity

• False Positive Rate: False Positives divided by total number of Negatives
• False Negative Rate: False Negatives divided by total number of Positives

Qualitative Methods (cont’d)
• Sensitivity = 100 x True Positives divided by (True Positives + False Negatives)
• Specificity = 100 x True Negatives divided by (True Negatives + False Positives)

Qualitative Methods (cont’d)
Predictive Values
• Describe the operation of a test on a mixed population of Positives and Negatives
• A property of the test and the population; affected by prevalence of Positives
• Positive Predictive Value = True Positives divided by (True Positives + False Positives)
• Negative Predictive Value = True Negatives divided by (True Negatives + False Negatives)

Evaluation Criteria
• High diagnostic value: 100% sensitivity, 100% specificity
• What happens if the True Positive rate is equal to the False Positive rate?

Activity
Estimate sensitivity and specificity of a qualitative method given a data set.

Activity (cont’d)
Create a validation plan for a quantitative assay to be performed in your laboratory.
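The sensitivity, specificity, predictive-value and error-rate formulas from the qualitative-method slides can be sketched in Python as follows. The class name `TwoByTwo` and the counts in the example are hypothetical. All results are expressed as percentages, which is an assumption for the predictive values and error rates, since the slides include the "100 x" factor only for sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a test method with a comparative method."""
    tp: int  # test positive, comparative positive
    fp: int  # test positive, comparative negative
    fn: int  # test negative, comparative positive
    tn: int  # test negative, comparative negative

    def sensitivity(self) -> float:
        return 100.0 * self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return 100.0 * self.tn / (self.tn + self.fp)

    def positive_predictive_value(self) -> float:
        return 100.0 * self.tp / (self.tp + self.fp)

    def negative_predictive_value(self) -> float:
        return 100.0 * self.tn / (self.tn + self.fn)

    def false_positive_rate(self) -> float:
        # False positives divided by total number of negatives.
        return 100.0 * self.fp / (self.fp + self.tn)

    def false_negative_rate(self) -> float:
        # False negatives divided by total number of positives.
        return 100.0 * self.fn / (self.fn + self.tp)


# Hypothetical study: 50 positive and 50 negative specimens by the
# comparative method (the recommended sample size on the slides).
table = TwoByTwo(tp=48, fp=1, fn=2, tn=49)
print(f"Sensitivity: {table.sensitivity():.1f}%")
print(f"Specificity: {table.specificity():.1f}%")
print(f"PPV: {table.positive_predictive_value():.1f}%")
print(f"NPV: {table.negative_predictive_value():.1f}%")
```

The sketch does not guard against empty rows or columns (for example, a data set with no positive specimens), which would raise a division-by-zero error.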
In Closing
Now that you have completed this module, you should be able to:
• Identify test classifications
• Define what each validation experiment details for testing methods
• Discuss what is recommended to perform each of the validation experiments for testing methods
• Recognize how to evaluate data obtained from each of the validation experiments

Post-Assessment Question #1
A rapid HIV test would likely be classified as a:
A. High complexity, modified assay
B. Moderate complexity, unmodified assay
C. FDA-approved, modified assay
D. Waived, FDA-approved, unmodified assay

Post-Assessment Question #2
The precision of a test method gives information related to the method’s:
A. Systematic error
B. Comparison of results to a reference method
C. Reproducibility
D. Likelihood of being affected by hemolysis, lipemia and icterus
E. Both A and B

Post-Assessment Question #3
When transferring reference intervals using 20 specimens, what is the minimum number that must fall within the manufacturer’s reference intervals?
A. 20
B. 18
C. 16
D. 15

Post-Assessment Question #4
Which linear regression equation component gives information regarding constant bias?
A. y
B. x
C. m (slope)
D. b (intercept)

References
• DAIDS Good Clinical Laboratory Practice (GCLP) Guidelines.
• www.westgard.com. Validation of Qualitative Methods.
• 42 CFR § 493.1253.
• College of American Pathologists Commission on Laboratory Accreditation, Accreditation Checklists, April 2006.
• Westgard, James O. Basic Method Validation, 2nd Edition. Madison, WI: Westgard QC, Inc., 2003.
• Clinical and Laboratory Standards Institute. User Protocol for Evaluation of Qualitative Test Performance; Approved Guideline. NCCLS document EP12-A. Clinical and Laboratory Standards Institute, Wayne, PA USA, 2002.
• Clinical and Laboratory Standards Institute. Evaluation of Precision Performance of Quantitative Measurement Methods. NCCLS document EP5-A2. Clinical and Laboratory Standards Institute, Wayne, PA USA, 2004.
• Clinical and Laboratory Standards Institute. User Verification of Performance for Precision and Trueness. CLSI document EP15-A2. Clinical and Laboratory Standards Institute, Wayne, PA USA, 2005.

Wrap Up
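As a recap of the quantitative calculations in this module, the sketch below reproduces the Precision Example arithmetic (Level 1 glucose, mean 90 mg/dL) and fits y = mx + b to the AMR verification averages from the linearity activity. It is an illustration under stated assumptions rather than a validated tool: the function names are hypothetical, the glucose limit is read as ± 6 mg/dL or ± 10%, whichever is greater, and the fit is an ordinary least-squares line, which the module does not prescribe.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: 100 x SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def allowable_total_error(mean_value, abs_limit=6.0, pct_limit=10.0):
    """Assumed glucose rule: 6 mg/dL or 10%, whichever is greater."""
    return max(abs_limit, pct_limit / 100.0 * mean_value)


def allowable_random_error(total_error, long_term=False):
    """0.25 x allowable total error short term, 0.33 x long term."""
    return (0.33 if long_term else 0.25) * total_error


def linear_fit(x, y):
    """Ordinary least-squares slope (m) and intercept (b) of y = mx + b."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Precision Example from the module: Level 1 glucose, mean 90 mg/dL.
tea = allowable_total_error(90.0)
print(f"Allowable total error: {tea:.2f} mg/dL")
print(f"Short-term random error allowed: {allowable_random_error(tea):.2f} mg/dL")
print(f"Long-term random error allowed: {allowable_random_error(tea, True):.2f} mg/dL")

# AMR verification data from the module: known concentration (x)
# versus the lab's average recovered value (y).
known = [25, 200, 375, 550, 725]
recovered = [24, 195.7, 360, 532, 702]
slope, intercept = linear_fit(known, recovered)
print(f"Slope (proportional bias): {slope:.3f}")
print(f"Intercept (constant bias): {intercept:.2f}")
```

An observed SD can then be compared with the allowable random error at the same concentration; a slope below 1 with an intercept near 0 points to proportional rather than constant bias.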