Bradley-Terry Model Analysis of Cat Food Recipes

Hongjie Deng (1), Daniel R. Jeske (2) and Ted Younglove (3)
Department of Statistics, University of California, Riverside, CA 92521
(1) Graduate Student, (2) Faculty and Director of Collaboratory, (3) Manager of Collaboratory

[Photographs: Cinnamon Wheaties; Lana Wheaties]

Introduction

The Del Monte Pet Products Division of Del Monte Foods conducted palatability studies of dry cat food, wet cat food, and cat treats using paired comparison consumption tests. The Statistical Consulting Collaboratory at the University of California, Riverside was consulted to improve the analysis of the paired comparison experiments, focusing initially on the experiments that used dry cat food.

Our goal was to apply Bradley-Terry modeling and analysis techniques to the experimental data. In particular, we wanted to estimate a quality score for each food recipe, test whether the scores were significantly different, and explore the power of alternative paired comparison designs. In this poster, we show how to analyze the data and select the food recipe that is most attractive to the cats.

Experimental Design

The experiment was conducted using a colony of 300 cats, male and female, of various breeds and ages. Each cat in a panel of 30 randomly selected cats was given two different bowls of food on each of two days. On the first day food A was placed to the left and food B was placed to the right. On the second day, the left-right orientation was reversed. The relative amount of each food, A and B, that the cats consumed over the two days was used to indicate which food they preferred.

Data

            Food A   Food B   Food C   Food D   Food E
Food A        -        20       22       20        1
Food B        9        -         6        7        1
Food C        8        24       -         8        2
Food D       10        23       21       -         3
Food E       29        29       27       27       -

Note: Cell (i, j) = number of cats who prefer food i over food j.

Bradley-Terry Model

Suppose there are t treatments in an experiment involving paired comparisons. Each pair of treatments is compared by n different judges, indexed by k.

Define $\Pr(\text{treatment } i \text{ is preferred over treatment } j) = \pi_{ij}$.
Define $r_{ijk}$ = rank of the i-th treatment when compared with the j-th treatment by judge k (1 if preferred, 2 otherwise).

The saturated likelihood function is

$$L(\pi_{12},\ldots,\pi_{t-1,t}) = \prod_{i<j}\prod_{k=1}^{n} \pi_{ij}^{\,2-r_{ijk}}\,(1-\pi_{ij})^{\,2-r_{jik}}$$

The Bradley-Terry link function is

$$\pi_{ij} = \frac{e^{v_i}}{e^{v_i}+e^{v_j}}$$

where $v_i$ = true rating of treatment i.

Rank treatments based on

$$p_i = \frac{1}{t-1}\sum_{j=1,\, j\neq i}^{t} \pi_{ij}$$

Estimation Of True Ratings

Maximum likelihood estimates of the true ratings $v_i$ were obtained using the R software package and are presented below.

Diet      $\hat{v}_i$    $\hat{p}_i$    Rank
Food E      2.3024        0.9413         1
Food A      0             0.5317         2
Food D     -0.2132        0.4802         3
Food C     -0.7307        0.3535         4
Food B     -1.4367        0.1933         5

Estimates Of $\pi_{ij}$ (row i preferred over column j)

            Food A   Food B   Food C   Food D   Food E
Food A        -       0.81     0.67     0.55     0.09
Food B       0.19      -       0.33     0.23     0.02
Food C       0.33     0.67      -       0.37     0.05
Food D       0.45     0.77     0.63      -       0.07
Food E       0.91     0.98     0.95     0.93      -

Model Goodness Of Fit

Ho: Bradley-Terry model fits the data
Ha: Bradley-Terry model does not fit the data

Test statistic: $-2\left(\log L_{H_0} - \log L(\pi_{12},\ldots,\pi_{t-1,t})\right)$, where $L_{H_0}$ is the saturated likelihood function with each $\pi_{ij}$ replaced by the Bradley-Terry link function.

Test Result: p-value = 0.1093. Do not reject Ho at $\alpha$ = 0.05.

How Well The Model Fits

Observed Frequencies and (Expected Frequencies)

            Food A        Food B        Food C        Food D        Food E
Food A        -           20 (23.43)    22 (20.25)    20 (16.59)     1 (2.73)
Food B       9 (5.57)       -            6 (9.91)      7 (6.82)      1 (0.70)
Food C       8 (9.75)     24 (20.09)      -            8 (10.83)     2 (1.33)
Food D      10 (13.41)    23 (23.18)    21 (18.17)      -            3 (2.24)
Food E      29 (27.27)    29 (29.30)    27 (27.67)    27 (27.76)      -

Multiple Comparison Procedure

To identify which food recipes are different with respect to cat preference, the procedure uses an algorithm to generate hypothetical tables of data under the null hypothesis $H_0: v_1 = v_2 = v_3 = v_4 = v_5$.

Test Algorithm

1. Randomly generate a table of data $10^4$ times (see "How To Randomly Generate A Table Of Data").
2. Calculate $\hat{v}_i^*$ (i = 1,…,t) for each table.
3. Obtain $Q = \max_{1 \le i < j \le t} |\hat{v}_i^* - \hat{v}_j^*|$ for each table.
4. For each (i, j) pair, calculate the Monte-Carlo p-value = (# of $Q > |\hat{v}_i - \hat{v}_j|$) / $10^4$.
5. Compare the Monte-Carlo p-value with $\alpha$.

How To Randomly Generate A Table Of Data

            Food A     Food B     Food C     Food D     Food E
Food A        -         C12        C13        C14        C15
Food B      29-C12       -         C23        C24        C25
Food C      30-C13     30-C23       -         C34        C35
Food D      30-C14     30-C24     29-C34       -         C45
Food E      30-C15     30-C25     29-C35     30-C45       -

C12 ~ Bin(29, 0.5); C13, C14, C15, C23, C24, C25, C45 ~ Bin(30, 0.5); C34, C35 ~ Bin(29, 0.5).

[Figure: Histogram Of Q (frequency versus Q, with Q ranging from 0.0 to 1.2), marking Pr(Q > 0.6512) = 0.05.]

Test Results

[Figure: foods shown in rank order E, A, D, C, B, with connecting lines.]
Note: Foods connected by a line are not significantly different at $\alpha$ = 0.05.

Power Analysis

Two Alternative Designs

D1: 10 panels comparing all pairs of recipes with 30 cats each
D2: 4 panels comparing (A,B), (B,C), (C,D), (D,E), each with 75 cats

D2 has the minimum number of comparisons needed to estimate all the ratings and is motivated by being a simpler experiment to manage.

Power Comparison

Ho: $v_1 = v_2 = v_3 = v_4 = v_5$ (i.e., no difference in recipes)
Ha: not Ho

Power levels for each design of a 5% test of Ho, using 1000 simulated data sets, are presented below.

True Ratings             D1       D2
(1,1,1,1,1)             0.055    0.050
(1,1,1,1,1.2)           0.111    0.084
(1,1,1,1.2,1.2)         0.180    0.095
(1,1,1.2,1.2,1.2)       0.178    0.078
(1,1,1,1,1.5)           0.544    0.378
(1,1,1,1.5,1.5)         0.743    0.338
(1,1,1.5,1.5,1.5)       0.776    0.383
(1,1,1,1,1.8)           0.946    0.769
(1,1,1,1.8,1.8)         0.992    0.754
(1,1,1.8,1.8,1.8)       0.993    0.797
(1,1.2,1.2,1.2,1.2)     0.122    0.074
(1,1.5,1.5,1.5,1.5)     0.554    0.365

Comparison Of Standard Deviation Of Contrasts

Contrast     D1      D2
v1-v2       0.23    0.24
v1-v3       0.23    0.33
v1-v4       0.24    0.42
v1-v5       0.24    0.48
v2-v3       0.23    0.24
v2-v4       0.23    0.34
v2-v5       0.24    0.41
v3-v4       0.24    0.23
v3-v5       0.24    0.33
v4-v5       0.24    0.24

Based on 1000 simulations under Ho. Results are relatively insensitive to the values assumed for the true $v_i$.

Conclusion

Although D2 is simpler to manage, it has less power than D1. The loss of information from using fewer panels is not compensated for by using more cats in each panel. Contrasts under D1 have equal precision, while under D2 they do not.

Special Thanks To:

Javier Suarez and Hua Yu of the UCR Statistical Consulting Collaboratory.
Graduate Students of the Fall 2005 offering of STAT 293: Yingtao Bi, Mike Huang, Steward Huang, Sungsu Kim, Scott Lesch, Rupam Pal, Jose Sanchez, Jason Wilson, Rui Xiao, Karen Huaying Xu, and Qi Zhang.
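Sketch: estimating the ratings. The maximum likelihood fit described under Estimation Of True Ratings can be illustrated with a short script. The poster states only that R was used; the Python sketch below instead uses the MM (minorization-maximization) iteration for Bradley-Terry likelihoods, which is an assumption of this example and not necessarily the authors' method. The function names are hypothetical. Food A is taken as the reference (rating 0) to match the table of estimates, and the Food E row of the data is read as 29, 29, 27, 27, consistent with the panel sizes.

```python
import math

FOODS = ["A", "B", "C", "D", "E"]

# WINS[i][j] = number of cats that preferred food i over food j (Data table).
WINS = [
    [0, 20, 22, 20, 1],
    [9, 0, 6, 7, 1],
    [8, 24, 0, 8, 2],
    [10, 23, 21, 0, 3],
    [29, 29, 27, 27, 0],
]


def fit_bradley_terry(wins, ref=0, tol=1e-10, max_iter=100_000):
    """Return ratings v_i (log strengths) with v[ref] = 0.

    Assumed method: MM updates gamma_i <- W_i / sum_j n_ij / (gamma_i + gamma_j),
    where W_i is the total number of wins of item i and n_ij the pair size.
    Requires every item to have at least one win (otherwise log(0) fails).
    """
    t = len(wins)
    gamma = [1.0] * t
    total_wins = [sum(row) for row in wins]
    for _ in range(max_iter):
        new = []
        for i in range(t):
            denom = sum(
                (wins[i][j] + wins[j][i]) / (gamma[i] + gamma[j])
                for j in range(t)
                if j != i
            )
            new.append(total_wins[i] / denom)
        # Fix the scale so that the reference item has strength 1 (rating 0).
        new = [g / new[ref] for g in new]
        change = max(abs(a - b) for a, b in zip(new, gamma))
        gamma = new
        if change < tol:
            break
    return [math.log(g) for g in gamma]


def preference_prob(v, i, j):
    """Bradley-Terry link: probability that item i is preferred over item j."""
    return math.exp(v[i]) / (math.exp(v[i]) + math.exp(v[j]))


v_hat = fit_bradley_terry(WINS)
t = len(FOODS)
# p_i = average probability that food i beats each of the other foods.
p_hat = [
    sum(preference_prob(v_hat, i, j) for j in range(t) if j != i) / (t - 1)
    for i in range(t)
]

for name, v, p in sorted(zip(FOODS, v_hat, p_hat), key=lambda row: -row[1]):
    print(f"Food {name}: v = {v:8.4f}   p = {p:.4f}")
```

If the data are read as above, the printed ratings should land close to the values in the table of estimates, since the maximum of the Bradley-Terry likelihood does not depend on the algorithm used to find it.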
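Sketch: goodness-of-fit statistic. The likelihood ratio statistic under Model Goodness Of Fit can be evaluated directly from the data and the reported ratings. The poster does not state the reference distribution; the sketch below assumes a chi-square distribution with 6 degrees of freedom (10 free pair probabilities in the saturated model minus 4 free ratings). That choice, the closed-form tail formula for even degrees of freedom, and all function names are assumptions of this example.

```python
import math
from itertools import combinations

# WINS[i][j] = cats preferring food i over food j, foods ordered A, B, C, D, E.
WINS = [
    [0, 20, 22, 20, 1],
    [9, 0, 6, 7, 1],
    [8, 24, 0, 8, 2],
    [10, 23, 21, 0, 3],
    [29, 29, 27, 27, 0],
]
# Estimated ratings copied from the poster, same food order.
V_HAT = [0.0, -1.4367, -0.7307, -0.2132, 2.3024]


def log_lik(wins, prob):
    """Log-likelihood kernel; binomial coefficients cancel in the ratio."""
    total = 0.0
    for i, j in combinations(range(len(wins)), 2):
        p = prob(i, j)
        total += wins[i][j] * math.log(p) + wins[j][i] * math.log(1.0 - p)
    return total


def bt_prob(i, j):
    """Pair probability under the Bradley-Terry link."""
    return math.exp(V_HAT[i]) / (math.exp(V_HAT[i]) + math.exp(V_HAT[j]))


def saturated_prob(i, j):
    """Pair probability under the saturated model: the observed proportion."""
    return WINS[i][j] / (WINS[i][j] + WINS[j][i])


def chi2_sf_even_df(x, df):
    """Upper tail of a chi-square variable; valid only for even df."""
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


statistic = -2.0 * (log_lik(WINS, bt_prob) - log_lik(WINS, saturated_prob))
df = 10 - 4  # assumed: number of pairs minus number of free ratings
p_value = chi2_sf_even_df(statistic, df)
print(f"statistic = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
```

Under these assumptions the p-value should be in the neighbourhood of the reported 0.1093; small differences can arise because the ratings are entered to four decimal places.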
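Sketch: Monte-Carlo multiple comparisons. The steps of the Test Algorithm can be sketched as follows. Tables are generated under Ho with the panel sizes from "How To Randomly Generate A Table Of Data", each table is refitted, and Q is the largest pairwise difference in fitted ratings. The MM fitting routine, the function names, and the reduced number of replicates (1000 here instead of the poster's 10^4, to keep the run short) are assumptions of this example.

```python
import math
import random
from itertools import combinations

FOODS = ["A", "B", "C", "D", "E"]
# Observed ratings copied from the poster, foods ordered A, B, C, D, E.
V_HAT = [0.0, -1.4367, -0.7307, -0.2132, 2.3024]
# Cats per pair: 30 everywhere except (A,B), (C,D) and (C,E), which have 29.
PANEL = {pair: 30 for pair in combinations(range(5), 2)}
PANEL[(0, 1)] = 29
PANEL[(2, 3)] = 29
PANEL[(2, 4)] = 29


def fit_ratings(wins, tol=1e-8, max_iter=10_000):
    """Assumed MM fit; ratings are identified only up to a common shift.

    Fails if some food has zero wins, which is vanishingly rare under Ho here.
    """
    t = len(wins)
    gamma = [1.0] * t
    totals = [sum(row) for row in wins]
    for _ in range(max_iter):
        new = [
            totals[i]
            / sum(
                (wins[i][j] + wins[j][i]) / (gamma[i] + gamma[j])
                for j in range(t)
                if j != i
            )
            for i in range(t)
        ]
        mean = sum(new) / t
        new = [g / mean for g in new]
        done = max(abs(a - b) for a, b in zip(new, gamma)) < tol
        gamma = new
        if done:
            break
    return [math.log(g) for g in gamma]


def random_table(rng):
    """One table under Ho: C_ij ~ Bin(n_ij, 0.5), mirrored cell n_ij - C_ij."""
    wins = [[0] * 5 for _ in range(5)]
    for (i, j), n in PANEL.items():
        c = sum(rng.random() < 0.5 for _ in range(n))
        wins[i][j] = c
        wins[j][i] = n - c
    return wins


def simulate_q(n_sim, seed=0):
    """Q = max_i v*_i - min_i v*_i, i.e. the largest pairwise difference."""
    rng = random.Random(seed)
    q_values = []
    for _ in range(n_sim):
        v_star = fit_ratings(random_table(rng))
        q_values.append(max(v_star) - min(v_star))
    return q_values


def monte_carlo_p_values(q_values):
    """Share of simulated Q values exceeding each observed |v_i - v_j|."""
    n = len(q_values)
    return {
        (FOODS[i], FOODS[j]): sum(q > abs(V_HAT[i] - V_HAT[j]) for q in q_values) / n
        for i, j in combinations(range(5), 2)
    }


q_values = simulate_q(n_sim=1000, seed=1)
p_values = monte_carlo_p_values(q_values)
critical = sorted(q_values)[int(0.95 * len(q_values))]  # approximate 5% cut-off
print(f"approximate 5% critical value of Q: {critical:.3f}")
for pair, p in p_values.items():
    print(pair, "not significant" if p > 0.05 else "significant", f"(p = {p:.3f})")
```

Pairs whose Monte-Carlo p-value exceeds 0.05 correspond to foods that would be joined by a line in the Test Results figure. With only 1000 replicates the estimated cut-off is noisier than the poster's value of 0.6512.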