Statistics of Fingerprints
Dakota Boyd, Dustin Short, Elizabeth Lee, John Huppenthal, Shelby Proft, Wacey Teller

History of Fingerprinting
• Fingerprints were originally recorded with paper and ink
• Prints were matched by trained individuals
• Initially, each country had its own standards
• Digital fingerprinting led to international standards
• Fingerprints can now be matched or partially matched using algorithms
Section 6.1-6.3 from The Fingerprint Sourcebook

Problems with Automated Fingerprint Processing Systems
• Digital fingerprint acquisition
• Image enhancement
• Feature/minutiae extraction
• Matching
• Indexing/retrieval
Section 6.4.1 from The Fingerprint Sourcebook

Fingerprint Acquisition
• Ink and paper method
• Latent prints
• Livescan images (fingerprint sensors)
  • FTIR optical scanner
  • Capacitive scanner
  • Piezoelectric scanner
  • Thermal scanner
Figure 6-6 from The Fingerprint Sourcebook

Image Enhancement
• Many acquisition types lead to many noise characteristics
• Enhancement algorithms help correct unwanted noise
• Latent print enhancement
• Automated enhancement
Figure 6-10 from The Fingerprint Sourcebook

Feature Extraction
• Binarization algorithm: black is ridges, white is valleys
• Thinning algorithm produces the thinned (skeletal) image
• Minutiae detection algorithm locates the x and y coordinates and the orientation theta of each minutia point
• Minutiae post-processing algorithm detects false minutiae
Section 6.4.4 from The Fingerprint Sourcebook

Matching
• Factors from fingerprint acquisition that influence matching: displacement, rotation, partial overlap, nonlinear distortion, pressure, skin conditions, noise from imaging, errors from feature extraction
• First, alignment must be established
  • Programs may use core and delta points to align fingerprints
  • Could use a Hough transform
• Then the minutiae are matched
• The fingerprint pair is then given a matching score
  • High = high probability the fingerprints are a match
  • Low = low probability the fingerprints are a match
Section 6.4.5 from The Fingerprint Sourcebook

Indexing Fingerprints
• Need to be able to index and retrieve the fingerprints of a given individual
• Before digital fingerprints, forensic experts used filing cabinets to organize prints using a classification system
• Prints are explicitly classified by overall shape: right loop, left loop, whorl, arch, tented arch, and double loop
• Prints can also be continuously classified using vectors
Section 6.4.6 from The Fingerprint Sourcebook

The Galton Model
• First probability model for fingerprint individuality (1892).
• Variously sized square papers were dropped over sections of a fingerprint, and a prediction was made of whether or not the paper covered minutiae.
• Model not based on the actual distribution or frequency of minutiae.
• Estimated the probability of the different pattern types and of the number of ridges in the selected region of the print.
• Probability of finding any given arrangement of minutiae in a fingerprint given as 1 in 68 billion.

The Osterburg Model
• 1977-1980
• Divide the fingerprint into 1 sq. mm cells and count the occurrences of 13 different minutiae appearances in each cell.
• Rarity of a fingerprint arrangement = product of the frequencies of all individual minutiae and empty cells.
• Example: a 72 sq. mm fingerprint with 12 ridge endings, each in its own cell, and 60 empty cells: probability = (0.766)^60 × (0.0832)^12 = 1.25 × 10^-20.
• 0.766 and 0.0832 are Osterburg’s observed frequencies of an empty cell and a ridge ending.
• Problem: this model assumes each cell event is independent.

The Stoney and Thornton Model
• 1985-1989
• Determined criteria for an ideal model to calculate the individuality of a fingerprint and the probabilistic strength of a match.
• Each minutiae pair is described by the six characteristics below and by the spatial position of the pair within the entire fingerprint.

Classifying Characteristics
• Ridge structure and description of minutiae locations.
• Description of minutia distribution.
• Orientation of minutiae.
• Variation in minutiae types.
• Variation among prints from the same source.
• Number of orientations and comparisons.
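
The arithmetic in the Osterburg example earlier in this section can be checked with a short Python sketch. The function name and parameters are illustrative only and do not come from the slides. The calculation relies on the same cell-independence assumption that the slide flags as the model's weakness.

```python
import math

# Osterburg's observed frequencies, as quoted in the example slide.
P_EMPTY_CELL = 0.766
P_RIDGE_ENDING = 0.0832


def osterburg_probability(n_empty: int, n_ridge_endings: int) -> float:
    """Probability of a cell arrangement, assuming independent 1 sq. mm cells."""
    # Sum logs instead of multiplying directly so larger grids do not underflow.
    log_p = (n_empty * math.log(P_EMPTY_CELL)
             + n_ridge_endings * math.log(P_RIDGE_ENDING))
    return math.exp(log_p)


# 72 sq. mm print: 60 empty cells and 12 cells holding one ridge ending each.
p = osterburg_probability(n_empty=60, n_ridge_endings=12)
print(f"{p:.3e}")
```

The result is on the order of 10^-20, in line with the figure quoted in the example. Because every cell is treated as independent, adding any further cell can only make the product smaller.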
The Pankanti, Prabhakar, and Jain Model
• 2001
• Model assesses the probability of a false match, not the individuality of fingerprints.
• Calculates the number of possible arrangements of ridge endings and bifurcations.
• Calculates spatial differences between paired minutiae, each described by (x, y, θ), and accepts pairs whose differences fall within tolerance as matches.
• Each fingerprint had four captures, separated into two databases, to determine an acceptable tolerance of error based on natural variations.

First Level Detail
• Direction of ridge flow in the print.
• Not necessarily limited to a defined fingerprint pattern.
• General direction of ridge flow is not unique.

Second Level Detail
• Pathway of specific ridges.
• Includes the starting position, path of the ridge, length, and where the ridge path stops.
• Includes configurations with other ridge paths.
• Uniqueness is found in the ridge path, length, and terminations.
• A general direction must exist (first level detail).

Third Level Detail
• Shapes of the ridge structures.
• Morphology of the ridge: edges, textures, and pore positions on the ridge.
• Shapes, sequences, and configurations of third level detail are unique.
• General direction (first level) and a specific ridge path (second level) must exist for third level detail.

Persistence
• Compares the visibility of minutiae in fingerprints over a time span.
• Galton found one discrepancy, where a single bifurcation was not present 13 years later.
• Other studies with age spans ranging up to 57 years found no discrepancies in minutiae.
• All in first and second level detail.

Persistence
• Pores on the ridges of friction ridge skin remain unchanged throughout life; their location remains the same.
• Palm creases (third level detail of the palm) have shown changes over long time periods, due to age of the skin, skin flexibility, and other factors.
• All in third level detail.

Persistence
• Basal layer (regenerative layer between dermis and epidermis).
• Friction ridge skin persistence is maintained by the regenerative cells in the stratum basale and the connective relationship of these cells.

Examination Method
• Analysis, comparison, evaluation (ACE) and verification (V)
• This is one description of a method of comparing details, forming a hypothesis about the source, experimenting to determine whether there is agreement or disagreement, analyzing the sufficiency of agreement or disagreement, rendering an evaluation, and retesting to determine whether the conclusion can be repeated.

Examination Method
Analysis
• The assessment of a print as it appears on the substrate.
• The examiner decides whether the print is sufficient for comparison with another print.
• Looks at the substrate, matrix, development medium, deposition pressure, and pressure and motion distortion to assess appearance and distortion.

Examination Method
Comparison
• Determining whether the details in two prints are in agreement, based upon similarity, sequence, and spatial relationship, occurs in the comparison phase.
• Because no print is ever perfectly replicated, mental comparative assessments consider tolerance for variations in appearance caused by distortion.
• Comparative measurements of first, second, and third level details are made, along with comparisons of the sequences and configurations of ridge paths.

Examination Method
Evaluation
• The formulation of a conclusion based upon analysis and comparison of friction ridge skin.
• The examiner makes the final determination as to whether a finding of individualization, or same source of origin, can be made.
• Comparative measurements of first, second, and third level details are made, along with comparisons of the sequences and configurations of ridge paths.

Examination Method
Recurring, Reversing, and Blending Application of ACE
• The examiner can change the phase of the examination often, re-analyzing, re-comparing, and re-evaluating.
• There is no clear linear path through the ACE process, because deciding whether the two fingerprints are the same complicates the sequence of phases.

Examination Method
• Because of the ambiguity of the process, a colored diagram is used to illustrate it.
• The critical application of ACE is represented in the model by red area A, green area C, and blue area E.
• The actual examination is represented in the model by three smaller circles with capital A, C, and E.

Examination Method
• The black dot in the center represents the subconscious processing of detail in which perception can occur.
• The gray represents other expert knowledge, beliefs, biases, influences, and abilities.
• The white that encircles the gray represents that the decision has been made.
• Many evaluations take place. Eventually the final analysis and comparison lead to the final evaluation.

Examination Method
Verification
• The independent examination by another qualified examiner resulting in the same conclusion.
• Another person goes through the ACE process to verify whether the conclusion about the two prints is the same.
• The verifier must not know the previous examiner's conclusion, so that the decision is unbiased.

Decision Thresholds
• Within each phase of ACE, a decision must be made whether to go forward, go backward, or stop the examination.
• History of thresholds:
  • New Scotland Yard adopted a policy (with some exceptions) of requiring 16 points
  • The FBI abandoned the practice of requiring a set number of points
  • The IAI (International Association for Identification) formed a committee to determine the minimum number of friction ridge characteristics which must be present in two impressions in order to establish positive identification

Decision Thresholds
• The prevailing threshold of sufficiency is the examiner's determination that sufficient quantity and quality of detail exists in the prints being compared
• Quantitative-qualitative threshold (QQ)
• For impressions from volar skin, as the quality
of details in the prints increases, the requirement for quantity of details decreases, and as the quantity of details increases, the requirement for quality decreases
• For clearer prints, fewer details are needed, and for less clear prints, more details are needed

QQ Threshold Curve
• One unit of uniqueness in agreement is the theoretical minimum needed to determine that the prints were made by the same unique and persistent source

QQ Threshold Curve
• Agreement (white area): sufficient details agree and support a determination that the prints came from the same source
• Disagreement (white area): sufficient details disagree and warrant a determination that the prints came from different sources
• Inconclusive (gray and black areas): the examiner cannot determine whether the details actually agree or disagree, or cannot determine sufficiency of sequences and configurations

APPLICATION OF SPATIAL STATISTICS TO LATENT PRINT IDENTIFICATIONS

Methodology
• Ten-print cards: qualitative image assessment
• Scan, segregate, and enhance images
• Orientation, ULW minutiae detection, mark core and delta
• Geo-referencing and image QC
• GIS data conversion
• Spatial analysis of ridge lines and minutiae
• Statistical analyses and probability modeling

Extraction Software
• Free Fingerprint Imaging Software: fingerprint pattern classification, minutiae detection, Wavelet Scalar Quantization (WSQ) compression, ANSI/NIST-ITL 1-2000 reference implementation, baseline and lossless JPEG, image utilities, math and neural net libraries
• Universal Latent Workstation (ULW): interoperable and interactive software for latent print examiners. The software improves the exchange and search of latent friction ridge images involving various Automated Fingerprint Identification Systems.
Distribution of Minutiae

Geometric Morphometric Analysis
• Research on fingerprints has traditionally been done using biometrics, which analyze linear geometric properties but ignore underlying biological properties
• Ignoring these may exclude important biological patterns
• Biomathematics includes the inherent biological properties of features
• Geometric morphometrics (GM) is a biomathematical model that includes biometrics, along with other fields, for a comprehensive analysis

GM Analysis
• Used for mandibular morphology, craniofacial features, identification using sinus cavities, and pediatric skeletal age
• For this project, GM was used to study shape variation of four pattern types: left and right loops, whorls, and double loop whorls
• GIS used for efficiency
• Tasks: establish methodology, begin analysis

Method: Landmark and Semi-landmark Designation and Acquisition
• 30 images of each pattern type referenced with ArcGIS to find the core and align in coordinate space
• Landmarks: core, aspects of the delta
• Semi-landmarks: points along a ridgeline
• For loops, the core was defined as the point along the innermost ridgeline that forms the first full loop where the tangential angle is closest to 0 degrees
• For whorls and double loop whorls, the core was defined as the ridge ending in the middle

Method: Landmark and Semi-landmark Designation and Acquisition
• Delta defined as a triradius consisting of 3 ridge systems converging with each other at an angle of ~120 degrees
• An equilateral triangle, sized as small as possible, was placed manually to define the delta
• 100% consensus among the team required

Method: Landmark and Semi-landmark Designation and Acquisition
• Core and vertices of triangles defined as landmarks
• For loops: radial line template of seven lines, eighteen degrees apart.
• Intersections of the lines and the first continuous ridgeline are semi-landmarks

Method: Landmark and Semi-landmark Designation and Acquisition
• For loops: two reference lines, one vertical, going through the core; one horizontal, from the lowermost vertex to the vertical line
• Ten equidistant lines drawn from the core to the horizontal line
• The points where the top six lines intersect the ridgeline that the core is on are additional landmarks

Method: Landmark and Semi-landmark Designation and Acquisition
• For whorls: line template constructed with thirteen lines, nine degrees apart
• Intersections of the lines with the first continuous ridgeline were landmarks
• After defining landmarks and semi-landmarks, GIS was used to record the features for all 120 prints

Method: Generalized Procrustes Analysis
• Landmark and semi-landmark coordinates superimposed into a common coordinate system in order to conduct statistical analysis
• Calculated Procrustes mean shape values

Method: Generalized Procrustes Analysis
• RSL and LSL (right and left loops), and W and DLW (whorls and double loop whorls), superimposed onto each other with geometric transformations to determine variance

Method: Thin-Plate Spline
• Procrustes mean shape values analyzed using R statistical software to produce thin-plate spline (TPS) deformation grids

Method: Thin-Plate Spline
• TPS grids provide a smooth interpolation of inter-landmark space and an exact mapping of landmarks and semi-landmarks from one pattern type onto another

Method: Principal Component Analysis
• Captured a percentage of total variation based on distribution to summarize the original, larger data set
• Direction of relative displacement for each landmark determined

Results: Generalized Procrustes Analysis
• LSL: semi-landmarks were tightly clustered around the mean shape, showing little shape variation for both the core ridgeline and the continuous ridgeline. Large dispersion of delta landmarks and crease landmark
• Whorls: continuous ridgeline showed little shape variation.
  Delta and crease landmarks showed significant variation
• LSL-RSL: greater dispersion due to size variation and rotational effects
• W-DLW: same as LSL-RSL

Results: Thin-Plate Spline
• The greater the deformation in the grid, the more shape variation between the two pattern types
• RSL-LSL: high degree of shape consistency, with greatest variation in the delta region
• W-DLW: same as RSL-LSL

Results: Principal Component Analysis
• Calculations used to reduce the full set of landmarks and semi-landmarks to one set that summarizes the degree of shape variation in each pattern type
• Direction of variation represented by a vector line
• Degree of variation indicated by the amount of deformation in the grid
• RSL-LSL: different directions of variation, greatest variation in delta regions
• W-DLW: greatest variation in delta regions

False-Match Probabilities and Monte Carlo Analyses
• “A computer algorithm used to repeatedly resample data from a given population to make inferences about stochastic processes”
• Ideal for rare events, which are hard to analyze with other methods
• Goal is to estimate an expected result, E(X), where X is a random variable.
• A Monte Carlo (MC) simulation creates n independent samples of X, and as n increases, the average of the samples converges to the expected result

False-Match Probabilities and Monte Carlo Analyses
• Used for village placement to avoid natural disasters, species diversity, evolution, air traffic control
• For this project: there is biological ground to believe that fingerprints are unique, but statistics allows for duplicates
• Uniqueness of the whole print is not in question, but partial duplicates are possible.
• Since examined prints are rarely full, the chance of partial duplicates needs to be assessed

False-Match Probabilities and Monte Carlo Analyses
• Methods and background are numerically and theoretically intensive, so the paper will be emailed to those who are interested
• No assumptions: works well for small sample sizes, but assumptions must be used for larger numbers
• Compared different sample sets to determine the probability of a false match
• 1200 fingerprints

False-Match Probabilities and Monte Carlo Analyses
• GIS used to standardize the coordinate space and analyze prints by section
• Eight simulations to determine how each attribute affects false-match probabilities
• Nine overlapping grid cells, with the total minutiae in each cell counted
• Sets of three, five, seven, or nine minutiae selected

False-Match Probabilities and Monte Carlo Analyses
• Minutiae selected without replacement
• 50 prints selected for each of LSL, RSL, W, and DLW
• 20 prints selected for arches, 25 for tented arches
• Simulations iterated 1000 times
• Comparisons across and within pattern types
• Needed to account for the variance of each minutia: bifurcation angles, ridge ending roundness, etc.

MC Results
• Similar probability results for all pattern types
• As the robustness of the simulation increased, the probability of a false match decreased greatly
• Using all criteria with location, three minutiae have a false-match chance of 1 in 5 million
• Using only location, the chance is 1 in 1600
• Using only location with 5 minutiae, the chance is 1 in 125,000
• Only one false match was found when considering the position of 9 minutiae

MC Results
• Highest false-match probability in regions below the core and near the delta (more minutiae)
• Regions above the core have very low false-match probability (fewer minutiae)
• Most matches found using Monte Carlo are obviously not matches when examined
• Similar patterns of minutiae position, but not type, were found
• Small sample size limits conclusions
• 100,000 fingerprints considered desirable for strong results (6-7 weeks of computer time)
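
The Monte Carlo idea described above (repeated sampling without replacement, with the sample average converging to the expected value) can be illustrated with a toy Python sketch. This is not the project's simulation. It uses synthetic prints with minutiae placed uniformly at random on a grid and compares location only, with an exact cell coincidence standing in for a match tolerance. The grid size, minutiae count, and function names are all hypothetical.

```python
import random

GRID = 20        # hypothetical: each print is discretised into GRID x GRID cells
N_MINUTIAE = 40  # hypothetical number of minutiae per synthetic print
CELLS = [(x, y) for x in range(GRID) for y in range(GRID)]


def synthetic_print(rng: random.Random) -> set:
    """Return the set of cells holding a minutia (uniform placement is an assumption)."""
    return set(rng.sample(CELLS, N_MINUTIAE))


def false_match_rate(k: int, n_trials: int, rng: random.Random) -> float:
    """Estimate the chance that k minutiae from one print all coincide,
    by location only, with minutiae of an unrelated print."""
    hits = 0
    for _ in range(n_trials):
        print_a = synthetic_print(rng)
        print_b = synthetic_print(rng)
        # Select k minutiae without replacement from the first print.
        chosen = rng.sample(sorted(print_a), k)
        if all(cell in print_b for cell in chosen):
            hits += 1
    # The sample average of the 0/1 match indicator estimates E(X).
    return hits / n_trials


rng = random.Random(0)
for k in (1, 2, 3):
    print(k, false_match_rate(k, n_trials=5000, rng=rng))
```

Under these toy assumptions the estimated rate drops sharply as k grows, which mirrors the qualitative trend in the MC Results slides. The numbers themselves say nothing about real fingerprints.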