Challenges in Image Analysis of Shoeprints
Yoram Yekutieli Ph.D., Sarena Wiesner M.Sc., Yaron Shor M.Sc.
SAMSI Forensics Opening Workshop, September 1st 2015

Outline:
• Tasks in image analysis of shoeprints (some examples from the work of others)
• Our project and the SESA system (Statistical Evaluation of Shoeprint Accidentals)
• Marking accidentals
• Shoe aligned coordinate system
• Orientation of accidentals
• Location of accidentals
• Shape of accidentals
• More issues: noise, databases

Why do we need image analysis of shoeprints?
• Identification of shoe sole pattern (shoeprint classification)
• Estimation of shoe sole size, orientation, wear
• Identification of a specific shoe
• Recognition of accidental characteristics and estimation of their features
• Image enhancement of shoeprints
• Visualization of shoeprint data
• Databases of shoeprints and accidental characteristics

Identification of shoe sole pattern
Robustness of the methods to:
• Changes in rigid transform: scale, translation, rotation, mirroring
• Deformations
• Lighting conditions
• Partial data (partial shoeprints)
• Lab impressions vs. crime scene prints
Processing related to the pattern:
• Feature extraction
• Pattern matching
• How new patterns are defined (learned?)

Identification of shoe sole size
• Either a comparison task of two images (two shoeprints, or a shoeprint and a photo of the outsole) to tell if the shoes have the same size.
• Or, given an image of a shoeprint, find the size (in inches or cm) of the shoe that created that print.
• After knowing the shoe pattern (given or recognized): is there a catalogue for all shoes?
• Without knowing the pattern: segmentation of the figure from the ground, automatic or manual with a dedicated GUI (graphical user interface).
• Crime scene prints vs. lab impressions
• Partial shoeprints

Identification of shoe sole orientation
A task of aligning two images of shoes (outsoles, shoeprints) vs. identification of the orientation of a shoe in an image.
Related to estimation of size.
• After knowing the shoe pattern (given or recognized): is there a catalogue for all shoes?
• Without knowing the pattern: segmentation of the figure from the ground, automatic or manual with a dedicated GUI (graphical user interface).
• Crime scene prints vs. lab impressions
• Partial shoeprints

Shoe sole wear
• A comparison task of two shoeprints to tell if they have the same wear pattern.
• Or, given one shoeprint (crime scene or lab impression), estimate the degree of wear.
• How should the answer be given? A number? A map of numbers related to locations on the shoe sole?
• Automatic or expert estimation?
• Given the pattern of the shoe or without it.
• 3D?

Identification of a specific shoe
Do they match? Are they a pair?
• Given two images (inputs), did they originate from the same shoe?
• Given a big database of specific shoes (shoeprints) and a target shoe, find its closest match in the database.
• Do they pair?
• Dealing with crime scene prints and partial prints
• The use of accidental characteristics

In: Proc. Int. Conf. Image Processing, vol. 4, pp. 441-444 (2007)
In: International Conference on Pattern Recognition, pp. 1-4 (2008)
In: Image Vision Computing 27, pp. 402 (2009)
In: Int. Conference on Granular Computing, pp. 459-464. IEEE (2010)

https://www.researchgate.net/profile/Yoram_Yekutieli

The SESA software system
Statistical Evaluation of Shoeprint Accidentals

Software package
Internal modules:
• Accidentals DATABASE
  a. Contours
  b. Locations
• Database queries
• Statistical models
User modules:
• Accidentals marking tools
  a. MarkAccidentals
  b. FaDeMa
• CheckContourS
• Strength of evidence expert assistance tool: CompareAccidentals

What is the probability of having a specific accidental?
It is the product of these terms:
• The probability of having an accidental in this specific location
• The probability of having an accidental with this specific orientation
• The probability of having an accidental with this specific shape

Marking and matching accidentals
How to mark?
manually, semi-automatically, or fully automatically.
• Learning texture around an accidental to robustly estimate the accidental contour?

What is shape (generally and on an accidental)?
• How should shape be defined to facilitate modeling its distribution?
• Should we treat the outer contour of the accidental? Its area? Its inside? Oriented edges?
• How to treat open accidentals vs. closed accidentals?
• Should the contour be continuous? Should it be composed of more than one part?
• What do we do with big accidentals that span many sole elements (such as a straight, elongated scratch that creates many gaps in neighboring line segments)?
• Should size be a separate entity, with its own estimate and modeled distribution?
• How should shapes be cataloged? Classified? Saved and retrieved from databases? Matched?
• What about partial shapes?

# 384 | size 127.5813 | type 2 | Is close YES | OK
# 384 | size 127.5813 | type 2 | Is close YES | OK
# 397 | size 78.2432 | type 2 | Is close YES | OK
# 397 | size 78.2432 | type 2 | Is close YES | OK
# 389 | size 93.723 | type 2 | Is close NO | OK
# 389 | size 93.723 | type 2 | Is close YES | OK
# 394 | size 133.2704 | type 4 | Is close NO | OK
# 394 | size 133.2704 | type 4 | Is close NO | OK
# 359 | size 299.0067 | type 1 | Is close YES | OK
# 359 | size 299.0067 | type 1 | Is close YES | OK

Coordinate systems
1. Objective: define a shoe aligned coordinate system, so that location and orientation (of features on the shoeprint) are given with respect to this coordinate system.
   But we want to collect data on many features from multiple shoes. We need a way to align all shoes together:
2. Define a universal shoe aligned coordinate system.
   A. All shoeprints should be aligned using this coordinate system (even for different patterns and sizes!).
   B. But very similar shoes will usually be aligned much better using a direct registration of the two prints.
   C. So data from multiple shoes can be combined: the locations of accidentals can be superimposed on one coordinate system, and we may attempt to describe their distribution. The same holds for orientation.

The diversity of shoe soles: many shapes, patterns and sizes.
So what is the meaning of a location or orientation in multiple shoes?
• All shoeprints must be aligned. All == past, present and future shoes.
• A more restrictive definition: the coordinate system is universal only over the subset given by a large collection of male suspects' shoes.

How?
• Parabola model
• Arcs model
Both methods were checked for consistency with the human markings and found to have less than 5 mm position error and less than 2 deg orientation error.

Orientation of accidentals
Again, there are two tasks:
• Matching shoes (and hence matching accidentals)
vs.
• Finding the orientation of a specific accidental
The orientation of an accidental is taken with respect to the shoe aligned coordinate system.
Extracting the orientation from the shape: finding the major axis and the minor axis of the shape (of the contour points of an accidental) using PCA (Principal Component Analysis).

We measured the orientation of the accidentals in our database: a nearly constant (uniform) distribution.
Histogram (90 bins, each of 2 degrees) of orientations of the ~8,900 accidentals of the CONTOURS data set.

Defining an elongation index as the ratio of the variances of contour points along the major axis and the minor axis. For a round shape this index is 1. For more elongated shapes, the index increases.

Measuring the error in orientation as a function of the elongation index (by repeated marking of a variety of accidentals).
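The PCA-based orientation, the elongation index and the orientation error between two repeated markings, as described above, can be sketched as follows. This is a minimal illustration and not the SESA implementation. It assumes a contour is a list of (x, y) points already expressed in the shoe aligned coordinate system. It also assumes axes are undirected, so orientations repeat every 180 degrees, which is consistent with the histogram of 90 bins of 2 degrees. All function names and the ellipse test shape are hypothetical.

```python
import math


def orientation_and_elongation(contour):
    """Return (angle_deg, elongation) for a list of (x, y) contour points.

    angle_deg  : direction of the major axis, in [0, 180), measured
                 counter-clockwise from the x axis of the input coordinates.
    elongation : variance along the major axis divided by variance along
                 the minor axis (1 for a round shape, larger when elongated).
    """
    n = len(contour)
    if n < 3:
        raise ValueError("need at least three contour points")

    # Center of gravity of the contour points.
    mean_x = sum(x for x, _ in contour) / n
    mean_y = sum(y for _, y in contour) / n

    # Entries of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in contour) / n
    syy = sum((y - mean_y) ** 2 for _, y in contour) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in contour) / n

    # Closed-form eigenvalues of a symmetric 2x2 matrix: these are the
    # variances along the major and minor principal axes.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    var_major = half_trace + spread
    var_minor = half_trace - spread

    # Direction of the eigenvector that belongs to the larger eigenvalue.
    angle_deg = math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy)) % 180.0

    elongation = var_major / var_minor if var_minor > 0 else math.inf
    return angle_deg, elongation


def orientation_error(angle_a_deg, angle_b_deg):
    """Smallest angle between two undirected axes, in [0, 90] degrees."""
    diff = abs(angle_a_deg - angle_b_deg) % 180.0
    return min(diff, 180.0 - diff)


def ellipse_contour(a, b, tilt_deg, samples=360):
    """Hypothetical test shape: an ellipse with semi-axes a >= b, tilted."""
    tilt = math.radians(tilt_deg)
    points = []
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = a * math.cos(t), b * math.sin(t)
        points.append((x * math.cos(tilt) - y * math.sin(tilt),
                       x * math.sin(tilt) + y * math.cos(tilt)))
    return points


# Two simulated markings of the same elongated accidental.
first_marking = ellipse_contour(3.0, 1.0, tilt_deg=30.0)
second_marking = ellipse_contour(3.0, 1.1, tilt_deg=33.0)
angle_1, elongation_1 = orientation_and_elongation(first_marking)
angle_2, _ = orientation_and_elongation(second_marking)
print(angle_1, elongation_1, orientation_error(angle_1, angle_2))
```

For a nearly round contour the two variances are almost equal, so the major axis direction is poorly determined. This is why the orientation error is studied as a function of the elongation index.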
First: for each pair (n > 82,000) of repeated markings of an accidental, we measured the difference in orientation (the orientation error).
Calculating the orientation error as a function of the elongation index.
The distribution is nearly constant; therefore the probability of finding an accidental with a specific orientation depends on the error in orientation: it is the area under the constant probability density over the range of the error.

Location of accidental characteristics
What is the probability of finding an accidental in a specific location in a shoeprint?
• We marked more than 13K accidentals on more than 400 shoeprints (from the Israeli police suspects datasets).
• Marking the center of gravity of accidentals, superimposed on one coordinate system.
• The estimated probability density function (normalized PDF) is the observed PDF (left) divided by the accumulated contact area (middle).
• Our collaborators from the statistics department of HUJI (graduate student Naomi Kaplan-Damary with Prof. Micha Mandel) work on unbiased estimators of the probability of location, under several assumptions.
• Do accidental mark locations follow a "homogeneous Poisson process"?

Estimating errors and operator variability
Errors and noise are a major issue when building pattern recognition systems. The challenges here are measuring the errors of the different components of the system, including those that involve the variance of the human operators. Issues of human-machine interaction and user interfaces that are tightly related to the image analysis system add complexity.

Databases and datasets
Many challenges:
Preparing a standard set of images (of shoe soles, shoeprints, crime scene, lab impressions, accidentals) to test algorithms and methods: to test the accuracy of a method of matching shoes, of classifying a pattern, of extracting relevant features.
Anecdote: in image datasets, it was found that object recognition and classification algorithms tend to be overfitted to the datasets. The reported performance of algorithms thus tends to be optimistic. There was a work that showed that the algorithms could be classified according to the dataset they were trained on!
http://www.youtube.com/watch?v=826HMLoiE_o#t=21

Thank you