Analysis of scores, datasets, and models in visual saliency modeling
Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, and Laurent Itti

[Example images from the Toronto dataset]

Visual Saliency
• Why is it important? What is the current status?
• How well does my method work?
• Methods: numerous, grouped into 8 categories (Borji and Itti, PAMI, 2012)
• Databases
• Measures: scan-path analysis, correlation-based measures, ROC analysis
• Benchmarks:
  - Judd et al.: http://people.csail.mit.edu/tjudd/SaliencyBenchmark/
  - Borji and Itti: https://sites.google.com/site/saliencyevaluation/
• Yet another benchmark?

The Dataset Challenge
• Dataset bias: center bias (CB), e.g., in the MIT and Le Meur datasets
• Border effect, e.g., in the Toronto dataset
• Metrics are affected by these phenomena, so a model can trick the metric.
• Solutions:
  - shuffled AUC (sAUC)
  - best smoothing factor
  - more than one metric

The Benchmark: fixation prediction

The Feature Crisis
• Low-level features: intensity, color, orientation, symmetry, depth, size
• High-level features: people, cars, signs, text
• Do these features capture semantic scene properties or affective stimuli?
• Challenge: performance across stimulus categories and affective stimuli

The Benchmark: image categories and affective data
• [Scores on affective stimuli] vs. 0.64 (non-emotional)

The Benchmark: predicting scanpaths
• Fixation sequences are encoded as strings of region labels (e.g., aAbBcCaA, aAdDbBcCaA, aAcCbBcCaAaA, ...) and compared with a string matching score (see the sketch below).

The Benchmark: predicting scanpaths (scores)

Category Decoding

Lessons Learned
• We recommend the shuffled AUC score for model evaluation (see the sketch below).
• The stimuli affect model performance.
• A combination of saliency and eye movement statistics can be used for category recognition.
• The gap between models and the inter-observer (IO) model appears small (though statistically significant), which signals the need for new datasets.
• The challenge of task decoding from eye movement statistics remains open.
• New saliency evaluation scores can still be introduced.

Questions?
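As a concrete illustration of the shuffled AUC score recommended above, here is a minimal Python sketch (using NumPy and scikit-learn). Positives are the model's saliency values at the fixations of the test image; negatives are saliency values at fixation locations sampled from other images, which discounts the shared center bias. The function name `shuffled_auc`, the min-max normalization, and the synthetic data in the usage example are illustrative assumptions, not the benchmark's reference implementation.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def shuffled_auc(sal_map, fixations, other_fixations):
    """Shuffled AUC sketch.

    sal_map         : 2-D saliency map (H x W), any real-valued scale.
    fixations       : (N, 2) int array of (row, col) fixations on this image (positives).
    other_fixations : (M, 2) int array of fixations pooled from *other* images (negatives),
                      which discounts center bias.
    """
    sal = (sal_map - sal_map.min()) / (sal_map.max() - sal_map.min() + 1e-12)

    pos = sal[fixations[:, 0], fixations[:, 1]]               # saliency at true fixations
    neg = sal[other_fixations[:, 0], other_fixations[:, 1]]   # saliency at shuffled fixations

    labels = np.concatenate([np.ones_like(pos), np.zeros_like(neg)])
    scores = np.concatenate([pos, neg])
    return roc_auc_score(labels, scores)

# Usage sketch: a purely center-biased map scores near chance (~0.5) under sAUC,
# because the shuffled negatives are themselves center-biased.
rng = np.random.default_rng(0)
h, w = 120, 160
yy, xx = np.mgrid[0:h, 0:w]
center_map = np.exp(-(((yy - h / 2) ** 2) / (2 * 20 ** 2) + ((xx - w / 2) ** 2) / (2 * 30 ** 2)))

fix = np.column_stack([rng.normal(h / 2, 15, 200).clip(0, h - 1),
                       rng.normal(w / 2, 20, 200).clip(0, w - 1)]).astype(int)
other = np.column_stack([rng.normal(h / 2, 15, 2000).clip(0, h - 1),
                         rng.normal(w / 2, 20, 2000).clip(0, w - 1)]).astype(int)
print(round(shuffled_auc(center_map, fix, other), 2))   # close to 0.5
```

Because the shuffled negatives carry the same center bias as the true fixations, a pure central Gaussian map scores near chance under sAUC, whereas it can score well above chance under ordinary AUC.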
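The scanpath-prediction slide compares fixation sequences as strings of region labels using a matching score. Below is a minimal sketch of one common variant: Levenshtein-based string matching over gridded fixations. The 5x5 grid, the two-character token encoding (lowercase column letter plus uppercase row letter, mimicking the 'aA', 'bB' tokens on the slide), and the normalization by the longer sequence are illustrative assumptions, not necessarily the exact procedure used in the benchmark.

```python
import numpy as np

def fixations_to_tokens(fixations, img_shape, grid=(5, 5)):
    """Encode a scanpath as a sequence of two-character region tokens.

    Each (row, col) fixation is binned into a grid cell; the column bin becomes
    a lowercase letter and the row bin an uppercase letter, e.g. 'aA' is the
    top-left cell (illustrative encoding).
    """
    h, w = img_shape
    rows, cols = grid
    tokens = []
    for r, c in fixations:
        ri = min(int(r / h * rows), rows - 1)
        ci = min(int(c / w * cols), cols - 1)
        tokens.append(chr(ord('a') + ci) + chr(ord('A') + ri))
    return tokens

def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return int(d[len(a), len(b)])

def matching_score(fix_a, fix_b, img_shape, grid=(5, 5)):
    """Similarity in [0, 1]: 1 means identical region sequences."""
    sa = fixations_to_tokens(fix_a, img_shape, grid)
    sb = fixations_to_tokens(fix_b, img_shape, grid)
    if not sa and not sb:
        return 1.0
    return 1.0 - edit_distance(sa, sb) / max(len(sa), len(sb))

# Usage sketch with two hand-made scanpaths on a 100 x 100 image.
scan_model = [(10, 10), (50, 50), (90, 90)]
scan_human = [(12, 14), (48, 52), (30, 70)]
print(round(matching_score(scan_model, scan_human, (100, 100)), 2))
```

A score of 1 means the two scanpaths visit exactly the same sequence of regions; a score of 0 means no token can be aligned without an edit.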