ISQC2013
A note on precision of qualitative data
Tomomichi Suzuki, Tokyo University of Science (szk@rs.tus.ac.jp)
Yusuke Tsutsumi, Mitsubishi Tanabe Pharma Corporation
Natsuki Sano, Tokyo University of Science
1 /41 2013-08-22 Tomomichi Suzuki, Tokyo University of Science

Self-introduction
• I focus on “Statistical Data Analysis” that bridges the gap between theory and practice.
• I am attending ISQC for the fourth time.
• Warsaw 2004, “Effective Dynamic Process Control of Assembly Processes”
  – statistical control of assembly processes with time-dependent noise
• Beijing 2007, “A Study on Adaptive Paired Comparison Experiments”
  – design of experiments for paired comparison
  – proposal on the Swiss tournament system
• Seattle 2010, “Improving Taguchi’s linear graphs for split-plot experiments”
  – proposed new linear graphs for expressing interactions of whole-plots and sub-plots

Outline of Today’s Talk
• Introduction
• Precision for Quantitative Data
• Precision for Qualitative Data
• Comparison
• Conclusions

Introduction
• ISO 5725: accuracy (trueness and precision) of measurement methods and results
• Tests performed on presumably identical materials in presumably identical circumstances do not, in general, yield identical results.
• Trueness:
  – refers to the closeness of agreement between the arithmetic mean of a large number of test results and the true or accepted reference value.
• Precision:
  – refers to the closeness of agreement between test results.

ISO/TC 69/SC 6
• ISO/TC 69 (Application of Statistical Methods)/SC 6 (Measurement Methods and Results)/WG 1 (Accuracy of measurement methods and results) is preparing a document (TR: Technical Report) on precision of qualitative data. It is now a Preliminary Work Item.
• Existing methods and established methods have been reviewed.

ISO/TC 69/SC 7
• ISO/TC 69 (Application of Statistical Methods)/SC 7 (Six Sigma) published ISO/TR 14468, “Selected illustrations of attribute agreement analysis”.
• This is based on the kappa coefficient approach.

ISO/TC 34/SC 9
• ISO/TC 34 (Food products)/SC 9 (Microbiology) produced ISO 16140, “Microbiology of food and animal feeding stuffs — Protocol for the validation of alternative methods”, in 2003.
• It includes the method by Langton et al. (2002).
• ISO/TC 34/SC 9 is revising ISO 16140 as “Microbiology of food and animal feed — Method validation — Part 2: Protocol for the validation of alternative (proprietary) methods against a reference method”.
• It includes the method by Wilrich.

AOAC International, ISO/TC 34/SC 16
• ISPAM (International Stakeholder Panel on Alternative Methods) produced a document, “Guidelines for Validation of Qualitative Chemistry Methods”, which is based on the POD model proposed by P. Wehling et al. (2011).
• This is the main part of the ISO/TC 34 (Food products)/SC 16 (Horizontal methods for molecular biomarker analysis) document.
“Validation Scheme for Qualitative Analytical Methods” (possible alternative title: “Performance characteristics and validation of binary measurement methods”)

IMEKO/TC 21
• IMEKO (International Measurement Confederation)/TC 21 (Mathematical Tools for Measurements) holds a SIG (Special Interest Group), “Precision evaluation in non-quantitative measurements”.

Purpose
• Many methods have been proposed for qualitative data, but their effectiveness and statistical properties are not well understood.
• This paper introduces methods for evaluating the precision of qualitative data, then proposes a method using a logit model. The proposed method is compared with existing methods.

Collaborative Assessment Experiment
• Every laboratory measures the identical test item a number of times.

Laboratory   Run 1   ...   Run k   ...   Run n
1            y11     ...   y1k     ...   y1n
2            y21     ...   y2k     ...   y2n
:            :             :             :
i            yi1     ...   yik     ...   yin
:            :             :             :
L            yL1     ...   yLk     ...   yLn

Precision for Quantitative Data
• Repeatability:
  – is the precision under repeatability conditions
  – where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time.
  – Repeatability indicates the smallest variation for a particular measurement method.
• Reproducibility:
  – is the precision under reproducibility conditions
  – where test results are obtained with the same method on identical test items in different laboratories with different operators using different equipment.
  – Reproducibility indicates the largest variation for a particular measurement method.

Precision for Quantitative Data
• Model used in ISO 5725: $y = m + B + e$
  – $y$ is the measurement result
  – $m$ is the general mean (expectation)
  – $B$ is the laboratory component of bias under repeatability conditions (variance $\sigma_L^2$)
  – $e$ is the random error occurring in every measurement under repeatability conditions (variance $\sigma_e^2$)

Precision for Quantitative Data
• Repeatability variance $\sigma_r^2$:
  $\sigma_r^2 = \sigma_e^2$, or $\sigma_r = \sqrt{V(e)}$
• Reproducibility variance $\sigma_R^2$:
  $\sigma_R^2 = \sigma_L^2 + \sigma_r^2 = \sigma_L^2 + \sigma_e^2$, or $\sigma_R = \sqrt{V(B) + V(e)}$   (1)
• The estimates of repeatability variance and reproducibility variance are calculated from interlaboratory studies or collaborative assessment experiments.

Precision for Quantitative Data
• Gauge R&R
• Many objects are measured (there is variation among products).
• Gauge Repeatability = Repeatability in ISO 5725 ($\sigma_r^2$)
• Gauge Reproducibility ≠ Reproducibility in ISO 5725
• Gauge Reproducibility = Between-laboratory variance in ISO 5725 ($\sigma_L^2$), where
  $\sigma_R^2 = \sigma_L^2 + \sigma_r^2$   (1)

Precision for Qualitative Data
• Non-quantitative measurements
  – binary data, categorical data, ordinal data, etc.
• In this paper, the methods to evaluate precision for binary data are considered.
• Methods compared are
  – Accordance and concordance (Langton’s)
  – Attribute agreement analysis (Kappa)
  – van Wieringen’s method
  – Proposed method

Precision for Qualitative Data
• The value of $y_{ik}$ is either 0 (negative, non-detect, fail, etc.) or 1 (positive, detect, pass, etc.).

Table 1
Laboratory   Run 1   ...   Run k   ...   Run n
1            y11     ...   y1k     ...   y1n
2            y21     ...   y2k     ...   y2n
:            :             :             :
i            yi1     ...   yik     ...   yin
:            :             :             :
L            yL1     ...   yLk     ...   yLn

Qualitative Methods 0. ISO Based Method
• Wilrich’s model: $y_{ij} = \pi_i + e_{ij}$ (ISO 5725 model: $y = m + B + e$)
  – $m$: general mean
  – $B$: laboratory component of bias
  – $e_{ij}$: random error
  – $\pi_i$, $\hat{\pi}_i$: probability of detection for laboratory $i$ and its estimate ($i = 1, 2, \dots, L$)
  – $n$: number of repetitions (measurements)
• Repeatability variance:
  $\hat{s}_r^2 = \frac{n}{n-1} \cdot \frac{1}{L} \sum_{i=1}^{L} \hat{\pi}_i (1 - \hat{\pi}_i)$
• Inter-laboratory variance component:
  $\hat{s}_L^2 = \max\left\{ 0,\ \frac{1}{L-1} \sum_{i=1}^{L} (\hat{\pi}_i - \bar{\pi})^2 - \frac{1}{n-1} \cdot \frac{1}{L} \sum_{i=1}^{L} \hat{\pi}_i (1 - \hat{\pi}_i) \right\}$, where $\bar{\pi}$ is the mean of the $\hat{\pi}_i$
• Reproducibility variance:
  $\hat{s}_R^2 = \hat{s}_r^2 + \hat{s}_L^2$

Qualitative Methods 1. Accordance and Concordance
• Accordance $A_i$: matching probability of two results within laboratory $i$
  $A_i = \frac{x_i (x_i - 1) + (n - x_i)(n - x_i - 1)}{n(n-1)}$, $A = \frac{1}{L} \sum_{i=1}^{L} A_i$
  – $A_i$: accordance for laboratory $i$ ($i = 1, 2, \dots, L$)
  – $A$: accordance
  – $x_i$: number of ‘detect’ results for laboratory $i$ ($i = 1, 2, \dots, L$)
  – $n$: number of repetitions (measurements)

Qualitative Methods 1. Accordance and Concordance
• Concordance $C_i$: matching probability of a result from laboratory $i$ and a result from another laboratory, with $C = \frac{1}{L} \sum_{i=1}^{L} C_i$
  $C = \frac{2 \sum_{i=1}^{L} x_i \left( \sum_{i=1}^{L} x_i - nL \right) + nL(nL - 1) - A\,nL(n-1)}{n^2 L (L-1)}$
  – $A$: accordance
  – $C_i$: concordance for laboratory $i$ ($i = 1, 2, \dots, L$)
  – $C$: concordance
  – $x_i$: number of ‘detect’ results for laboratory $i$ ($i = 1, 2, \dots, L$)
  – $n$: number of repetitions (measurements)

Qualitative Methods 1. Accordance and Concordance
• Relation between the ISO based method and accordance, concordance:
  $\hat{s}_r^2 = \frac{1 - A}{2}$, $\hat{s}_R^2 = \frac{1 - C}{2}$
  – $\hat{s}_r^2$, $\hat{s}_R^2$: ISO 5725 based
  – $A$: accordance, $C$: concordance

Qualitative Methods 2. Van Wieringen et al. (2008)
• $X_{ijh}$: measurement, binary (1: detect, pass; 0: no-detect, fail)
• $Y_i$: true value
• $i = 1, \dots, n$ (items), $j = 1, \dots, m$ (appraisers), $h = 1, \dots, l$ (repetitions)
• Model: $P(X_{i,1,1}, X_{i,1,2}, \dots, X_{i,m,l} \mid Y_i) = \prod_{j,h} P(X_{ijh} \mid Y_i)$
• True probability of ‘pass’: $\theta = P(Y_i = 1)$
• Sensitivity: $\pi_j(1)$; Specificity: $1 - \pi_j(0)$, where $\pi_j(y) = P(X_{ijh} = 1 \mid Y_i = y)$
• Latent class model:
  $P(X_{ijh} = x) = (1 - \theta)\, \pi_j(0)^x (1 - \pi_j(0))^{1-x} + \theta\, \pi_j(1)^x (1 - \pi_j(1))^{1-x}$

Qualitative Methods 2. Van Wieringen et al. (2008)
• Likelihood when $\psi = (\theta, \pi_1(1), \dots, \pi_m(1), \pi_1(0), \dots, \pi_m(0))^T$:
  $L(R; \psi) = \prod_{i=1}^{n} \left[ (1 - \theta) \prod_{j=1}^{m} \binom{l}{R_{ij}} \pi_j(0)^{R_{ij}} (1 - \pi_j(0))^{l - R_{ij}} + \theta \prod_{j=1}^{m} \binom{l}{R_{ij}} \pi_j(1)^{R_{ij}} (1 - \pi_j(1))^{l - R_{ij}} \right]$, where $R_{ij} = \sum_{h=1}^{l} X_{ijh}$
  – Maximum likelihood estimate using the EM algorithm
  – Maximum value: $L_{RR}$
• Likelihood when $\pi_1(0) = \dots = \pi_m(0)$ and $\pi_1(1) = \dots = \pi_m(1)$:
  $L(R; \psi) = \prod_{i=1}^{n} \left[ (1 - \theta) \binom{ml}{R_i} \pi(0)^{R_i} (1 - \pi(0))^{ml - R_i} + \theta \binom{ml}{R_i} \pi(1)^{R_i} (1 - \pi(1))^{ml - R_i} \right]$, where $R_i = \sum_{j=1}^{m} R_{ij}$
  – Maximum likelihood estimate using the EM algorithm
  – Maximum value: $L_R$

Qualitative Methods 2. Van Wieringen et al. (2008)
• Repeatability
• Reproducibility

Qualitative Methods 3. Attribute Agreement Analysis
• Fleiss’ kappa statistic:
  $\hat{\kappa} = \frac{\hat{P}_o - \hat{P}_e}{1 - \hat{P}_e}$
  $\hat{P}_o = \frac{1}{nm(m-1)} \left( \sum_{i=1}^{n} \sum_{j=1}^{M} x_{ij}^2 - nm \right)$
  $\hat{P}_e = \sum_{j=1}^{M} p_j^2$
  – $\hat{P}_o$: probability that results actually matched
  – $\hat{P}_e$: probability that results match by chance
  – $n$: number of items
  – $m$: number of appraisers
  – $M$: number of categories
  – $p_j$: ratio of category $j$
  – $x_{ij}$: number of times item $i$ is categorized as $j$
• $-1 \le \hat{\kappa} \le 1$
  – 1: complete agreement
  – 0: the same as chance (no correlation)
  – −1: complete non-agreement
• Kappa is evaluated within appraisers and between appraisers.

Qualitative Methods 4. Proposed Method
• We propose a method of estimating repeatability and reproducibility using the logit transformation.
• When the number of positive results $x_i$ follows a binomial distribution with parameters $n$ and $\pi_i$, the logit transformation of $\pi_i^*$ asymptotically follows a normal distribution:
  $L_i \sim N\left( \ln\frac{\pi_i}{1 - \pi_i},\ \frac{1}{n \pi_i (1 - \pi_i)} \right)$
  where $L_i = \mathrm{Logit}(\pi_i^*) = \ln\frac{\pi_i^*}{1 - \pi_i^*}$   (4)
  and $\pi_i^* = \frac{x_i + 0.5}{n + 1}$

Qualitative Methods 4. Proposed Method
• When we consider $x_i$ as the measurement result of laboratory $i$, the variances can be estimated by means of one-way ANOVA as shown below.
• Repeatability variance:
  $\hat{s}_r^2 = \frac{1}{L} \sum_{i=1}^{L} \frac{1}{n \hat{\pi}_i (1 - \hat{\pi}_i)}$   (6)
• Reproducibility variance:
  $\hat{s}_R^2 = \frac{1}{L-1} \left\{ \sum_{i=1}^{L} L_i^2 - \frac{1}{L} \left( \sum_{i=1}^{L} L_i \right)^2 \right\}$   (7)

Comparison of methods
• Methods are compared using the same set of data in order to clarify the relation among the methods.
  – Accordance and concordance
  – Attribute agreement analysis (Kappa)
  – van Wieringen’s method
  – Proposed method
• For Langton's method and the proposed method, the comparison uses the average of the precision measures obtained.
• The parameters are set based on van Wieringen's method.
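
The estimators on the slides above (ISO based method, accordance and concordance, and equations (6) and (7) of the proposed method) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy 4-laboratory, 5-run data set are hypothetical, and the estimate in equation (6) is assumed to be the adjusted proportion (x_i + 0.5)/(n + 1), which the slides do not state explicitly.

```python
import math


def detect_counts(results):
    """Number of positive (1) results x_i for each laboratory (one row per lab)."""
    return [sum(row) for row in results]


def iso_based(results):
    """ISO 5725 based estimates (s_r^2, s_L^2, s_R^2) for binary results."""
    L, n = len(results), len(results[0])
    p = [x / n for x in detect_counts(results)]
    p_bar = sum(p) / L
    within = sum(pi * (1 - pi) for pi in p) / L
    s_r2 = n / (n - 1) * within
    between = sum((pi - p_bar) ** 2 for pi in p) / (L - 1)
    s_L2 = max(0.0, between - within / (n - 1))
    return s_r2, s_L2, s_r2 + s_L2


def accordance_concordance(results):
    """Accordance A and concordance C from the formulas on the slides."""
    L, n = len(results), len(results[0])
    x = detect_counts(results)
    A = sum((xi * (xi - 1) + (n - xi) * (n - xi - 1)) / (n * (n - 1)) for xi in x) / L
    r = sum(x)
    C = (2 * r * (r - n * L) + n * L * (n * L - 1) - A * n * L * (n - 1)) / (
        n ** 2 * L * (L - 1)
    )
    return A, C


def logit_based(results):
    """Proposed-method estimates (s_r^2, s_R^2) on the logit scale.

    Assumption: the estimate used in equation (6) is (x_i + 0.5) / (n + 1).
    """
    L, n = len(results), len(results[0])
    p_star = [(xi + 0.5) / (n + 1) for xi in detect_counts(results)]
    logits = [math.log(p / (1 - p)) for p in p_star]
    s_r2 = sum(1 / (n * p * (1 - p)) for p in p_star) / L
    s_R2 = (sum(v * v for v in logits) - sum(logits) ** 2 / L) / (L - 1)
    return s_r2, s_R2


# Hypothetical toy data: 4 laboratories, 5 runs each (1 = detect, 0 = non-detect).
results = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0],
]

print("ISO based (s_r^2, s_L^2, s_R^2):", iso_based(results))
print("Accordance, concordance:", accordance_concordance(results))
print("Logit based (s_r^2, s_R^2):", logit_based(results))
```

For data where the inter-laboratory component is not truncated at zero, the ISO based values can be checked against (1 − A)/2 and (1 − C)/2, the relation given on the accordance and concordance slide.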
Comparison of methods
• Values of parameters (for the figures on the next page)
  – overall probability of an item being conforming: 0.5
  – number of items: 200
  – number of raters L: 3
  – number of repetitions for each rater n: 3
  – probability of evaluating a conforming item as pass: 0.99, 0.95, 0.90
  – probability of evaluating a nonconforming item as pass: 0.01, 0.05, 0.10
  (these probabilities are for raters 1 to 3, respectively)

Comparison of methods: Results
• Repeatability (top row) and reproducibility (bottom row)
[Figure: six scatter plots. Top row: van Wieringen(r), κ within raters, and logit(r), each plotted against Accordance. Bottom row: van Wieringen(R), κ between raters, and logit(R), each plotted against Concordance.]
• The methods are strongly related, but not identical.

Comparison of methods: Results
• Accordance and concordance, attribute agreement analysis, and the proposed method gave very similar results.
• The method proposed by van Wieringen also gave similar results, but the relationship was not as strong.
• The reason why the methods give different precision measures should be investigated.
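
The parameter values listed above can be used to generate binary rating data for such a comparison. The following Python sketch draws data from the latent class model (each item is conforming with probability 0.5, and each rater passes it with the stated probabilities) and computes Fleiss' kappa within each rater from the formula on the attribute agreement analysis slide. It is only an illustration and does not reproduce the figures of the study: the function names, the random seed, and the choice to treat the three repetitions of one rater as the ratings entering kappa are all assumptions.

```python
import random


def simulate(n_items=200, n_reps=3, prevalence=0.5,
             p_pass_conforming=(0.99, 0.95, 0.90),
             p_pass_nonconforming=(0.01, 0.05, 0.10),
             seed=0):
    """Return data[item][rater][rep] in {0, 1} drawn from the latent class model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_items):
        conforming = rng.random() < prevalence
        probs = p_pass_conforming if conforming else p_pass_nonconforming
        data.append([[int(rng.random() < p) for _ in range(n_reps)] for p in probs])
    return data


def fleiss_kappa(counts):
    """Fleiss' kappa from counts[i][j] = number of ratings of item i in category j.

    Every item must have the same number of ratings m >= 2.
    """
    n = len(counts)
    m = sum(counts[0])
    total = n * m
    p_o = (sum(c * c for row in counts for c in row) - total) / (total * (m - 1))
    n_categories = len(counts[0])
    p_e = sum((sum(row[j] for row in counts) / total) ** 2 for j in range(n_categories))
    if p_e == 1.0:
        return float("nan")  # all ratings fall in one category; kappa is undefined
    return (p_o - p_e) / (1 - p_e)


data = simulate()
n_raters = len(data[0])

# Kappa within each rater: the repetitions of one rater play the role of the ratings.
kappa_within = []
for rater in range(n_raters):
    counts = []
    for item in data:
        passes = sum(item[rater])
        counts.append([len(item[rater]) - passes, passes])
    kappa_within.append(fleiss_kappa(counts))

print("kappa within raters:", kappa_within)
```

Repeating the simulation with different seeds, and computing the other precision measures on the same data, gives one point per data set for scatter plots of the kind described in the figure.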
Conclusions
• A method utilizing the logit transformation is proposed.
• The proposed method and other existing methods are compared using the same set of data.
• Accordance and concordance, attribute agreement analysis, and the proposed method gave very similar results.
• The method proposed by van Wieringen also gave similar results, but the relationship was not as strong.
• Other methods (POD models, etc.) should also be compared; how to compare them remains an open problem.
• These findings are expected to contribute to the standardization of evaluating the precision of binary measurements.

References
• Danila, O., Steiner, S. H., and Mackay, R. J. (2008). Assessing a Binary Measurement System. Journal of Quality Technology, 40, 310-318.
• Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. 2nd edition, John Wiley & Sons.
• Horie, K., Tsutsumi, Y., and Suzuki, T. (2008). Calculation of Repeatability and Reproducibility for Qualitative Data. Proc. 6th ANQ Congress, 12 pages (CD-ROM).
• ISO 5725 (1994). Accuracy (trueness and precision) of measurement methods and results – Part 1: General principles and definitions.
• ISO 5725 (1994). Accuracy (trueness and precision) of measurement methods and results – Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method.
• ISO/TR 14468 (2010). Selected illustrations of attribute agreement analysis.
• Langton, S. D., Chevennement, R., Nagelkerke, N., and Lombard, B. (2002). Analysing collaborative trials for qualitative microbiological methods: accordance and concordance. International Journal of Food Microbiology, 79, 175-181.
• Mandel, J. (1997). Repeatability and Reproducibility for Pass/Fail Data. Journal of Testing and Evaluation, 25, 151-153.
• Van der Voet, H., and van Raamsdonk, L. W. D. (2004). Estimation of accordance and concordance in inter-laboratory trials of analytical methods with qualitative results. International Journal of Food Microbiology, 95, 231-234.
• Wehling, P., LaBudde, R. A., Brunelle, S. L., and Nelson, M. T. (2011). Probability of Detection (POD) as a statistical model for the validation of qualitative methods. Journal of AOAC International, 94, 335-347.
• Van Wieringen, W. N., and de Mast, J. (2008). Measurement System Analysis for Binary Data. Technometrics, 50, 468-478.
• Wilrich, P.-Th. (2010). The determination of precision of qualitative measurement methods by interlaboratory experiments. Accreditation and Quality Assurance, 15, 439-444.

Thank you for your attention!