Predicting Performance in Large-Scale Identification Systems by Score Resampling

Sergey Tulyakov and Venu Govindaraju
{tulyakov, govind}@cubs.buffalo.edu
CUBS, University at Buffalo, Amherst, NY 14228
(Presented by Ricardo Rodrigues)

Problem Statement:

Sets of matching scores in an identification system with $G_T = 100$ enrollees:

$$\{x(i),\ y_1(i),\ y_2(i),\ \ldots,\ y_{G_T}(i)\}$$ - ith test identification trial

where $x(i)$ is the genuine score and $y_1(i), \ldots, y_{G_T}(i)$ are the impostor scores.

In a projected identification system with the same matcher and $G = 3000$ enrollees, find the correct identification rate:

$$\mathrm{CIR} = P\left(x(j) > \max_{1 \le k \le G} y_k(j)\right)$$

where $\{x(j),\ y_1(j),\ y_2(j),\ \ldots,\ y_G(j)\}$ is the jth identification trial in the projected system.

Analysis of Previous Approaches:

Score mixing effect:
• Action: mix scores from different test identification trials during prediction (e.g. to estimate the distribution of impostor scores), such as scores from trial $\{x(i), y_1(i), y_2(i), \ldots, y_{G_T}(i)\}$ with scores from trial $\{x(k), y_1(k), y_2(k), \ldots, y_{G_T}(k)\}$
• Effect: assumption of iid genuine and impostor scores from different trials; underestimation of CIR
• Confirming the effect's existence: check the performance of a hypothetical identification system with randomly chosen impostor scores

Binomial approximation effect:
• Action: try to approximate

$$\mathrm{CIR} \approx \int N^G(t)\, m(t)\, dt$$

where $N(t)$ is the cumulative distribution function of impostor scores and $m(t)$ is the density of genuine scores, with $N$ estimated from $K$ impostor scores as

$$N(x) \approx \hat{N}(x) = \frac{1}{K} \sum_{j=1}^{K} I(y_j < x)$$

• Effect: small errors in estimating $N(t)$ lead to large errors in $N^G(t)$; overestimation of CIR
• Confirming the effect's existence: check the binomial approximation in a system with randomly chosen impostor scores

Combination of both effects:
By assuming independence of impostors and using the binomial approximation, we are under the influence of both effects. Either the binomial approximation effect or the score mixing effect can be dominant, and a correct prediction might be obtained accidentally.

Proposed Algorithm - Score Resampling:
• Perform a simulation of the larger identification system with scores chosen from the available test trials of the smaller identification system
• Control the score mixing effect by allowing only 'similar' test trials in the construction of a single simulated trial

Experiments with different methods of controlling the score mixing effect:

Selecting impostor scores from test trials having nearest genuine scores
• marginally better than taking random impostors

Selecting impostor scores from test trials having similar mean and variance
• better than the nearest genuine method
• better than using T-normalization
• might still give an incorrect prediction

Selecting impostor scores from test trials having similar nth order statistics
• closest prediction among all tried methods
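
Illustrative sketch (not the authors' implementation): the Python code below shows one way the binomial approximation and the resampling simulation could be written, using the "nearest genuine scores" criterion as the similarity rule. The function names, the `n_neighbors` parameter, the synthetic data, and the choice to sample impostor scores with replacement are all assumptions made for illustration; the poster does not specify these details.

```python
import bisect
import random


def binomial_cir(trials, G):
    """Binomial approximation: average of N_hat(x)**G over observed genuine scores."""
    # Pooling impostor scores over all trials is where score mixing enters.
    pooled = sorted(y for _, impostors in trials for y in impostors)
    K = len(pooled)
    total = 0.0
    for x, _ in trials:
        n_hat = bisect.bisect_left(pooled, x) / K  # fraction of pooled impostors < x
        total += n_hat ** G
    return total / len(trials)


def resampled_cir(trials, G, n_sim=1000, n_neighbors=None, seed=0):
    """Simulate n_sim trials of a system with G impostor scores per trial.

    n_neighbors=None draws impostors from all test trials (random impostors);
    otherwise only from the n_neighbors trials with the nearest genuine score.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_sim):
        x, _ = rng.choice(trials)  # genuine score of the simulated trial
        if n_neighbors is None:
            donors = trials
        else:
            donors = sorted(trials, key=lambda t: abs(t[0] - x))[:n_neighbors]
        pool = [y for _, impostors in donors for y in impostors]
        simulated = rng.choices(pool, k=G)  # assumption: sample with replacement
        if x > max(simulated):
            correct += 1
    return correct / n_sim


if __name__ == "__main__":
    # Hypothetical test data: 200 trials with 100 impostor scores each, where
    # a per-trial shift makes genuine and impostor scores dependent.
    data_rng = random.Random(1)
    trials = []
    for _ in range(200):
        shift = data_rng.gauss(0.0, 1.0)
        genuine = shift + data_rng.gauss(4.0, 1.0)
        impostors = [shift + data_rng.gauss(0.0, 1.0) for _ in range(100)]
        trials.append((genuine, impostors))

    print("binomial approximation:", binomial_cir(trials, 3000))
    print("random impostors:      ", resampled_cir(trials, 3000))
    print("nearest genuine (k=10):", resampled_cir(trials, 3000, n_neighbors=10))
```

Each element of `trials` is a pair (genuine score, list of impostor scores) from one test identification trial. The other similarity criteria listed above (mean and variance, nth order statistics) would replace only the sort key used to pick donor trials.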