Statistics Presentation

Statistics of Fingerprints
Dakota Boyd, Dustin Short, Elizabeth Lee, John
Huppenthal, Shelby Proft, Wacey Teller
History of
Fingerprinting
Originally used paper and ink fingerprints
Fingerprints were matched using trained individuals
Initially, each country had its own standards
Digital fingerprinting led to international standards
Fingerprints can now be matched or partially
matched using algorithms
Section 6.1-6.3 from The Fingerprint Sourcebook
Problems with Automated
Fingerprint Processing Systems
Digital Fingerprint acquisition
Image enhancement
Feature/Minutiae extraction
Matching
Indexing/retrieval
Section 6.4.1 from The Fingerprint Sourcebook
Fingerprint Acquisition
Ink and paper method
Latent prints
Livescan images – fingerprint
sensors
FTIR optical scanner
Capacitive scanner
Piezoelectric scanner
Thermal scanner
Figure 6-6 from The Fingerprint Sourcebook
Image Enhancement
Many acquisition types
lead to many noise
characteristics
Enhancement algorithms
help correct unwanted
noise
Latent Print Enhancement
Automated Enhancement
Figure 6-10 from The Fingerprint Sourcebook
Feature Extraction
Binarization algorithm – Black is ridges, white is
valleys
Thinning algorithm leads to the thinned image or
skeletal image
Minutiae detection algorithm locates the x, y, and
theta coordinates of the minutiae points
Minutiae post processing algorithm to detect
false minutiae
Section 6.4.4 from The Fingerprint Sourcebook
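The detection step above can be sketched with the crossing-number method, a common textbook way to find minutiae on a thinned image. The slides do not say which detector is used, so the technique, the name `find_minutiae`, and the small `EXAMPLE_SKELETON` are assumptions for illustration. Only x, y, and type are recovered here; estimating theta is left out.

```python
def find_minutiae(skeleton):
    """Locate minutiae on a thinned (one-pixel-wide) binary ridge image.

    skeleton: list of rows, 1 = ridge pixel, 0 = valley pixel.
    Returns a list of (x, y, kind) tuples in row-by-row scan order.
    """
    # The 8 neighbours in circular order, starting at the pixel to the right.
    ring = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
    found = []
    for r in range(1, len(skeleton) - 1):
        for c in range(1, len(skeleton[0]) - 1):
            if skeleton[r][c] != 1:
                continue
            vals = [skeleton[r + dr][c + dc] for dr, dc in ring]
            # Crossing number: half the number of 0/1 changes around the ring.
            cn = sum(abs(vals[i] - vals[(i + 1) % 8]) for i in range(8)) // 2
            if cn == 1:
                found.append((c, r, "ending"))
            elif cn == 3:
                found.append((c, r, "bifurcation"))
    return found


# Hypothetical skeleton: one ridge that forks into two branches.
EXAMPLE_SKELETON = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
```

On this skeleton, `find_minutiae(EXAMPLE_SKELETON)` reports one bifurcation at the fork and three ridge endings. A real pipeline would follow this with the post-processing step to discard false minutiae.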
Matching
Factors that influence matching from fingerprint
acquisition: displacement, rotation, partial overlap,
nonlinear distortion, pressure, skin conditions, noise
from imaging, errors from feature extraction
First need to establish alignment
Programs may use core and delta points to align
fingerprints
Could use Hough transform
Then match minutiae
Fingerprint is then given a matching score
High = high probability the fingerprints are a match
Low = low probability the fingerprints are a match
Section 6.4.5 from The Fingerprint Sourcebook
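The alignment step can be illustrated with a Hough-style vote: every pairing of a query minutia with a reference minutia votes for the rotation and shift that would make the pair coincide, and the most-voted bin is taken as the alignment. This is a minimal sketch, not the Sourcebook's algorithm; the function names, bin sizes, and the `QUERY` data are assumptions.

```python
import math
from collections import Counter


def transform(minutiae, rot_deg, dx, dy):
    """Rotate (x, y, theta_degrees) minutiae about the origin, then shift them."""
    rot = math.radians(rot_deg)
    return [
        (x * math.cos(rot) - y * math.sin(rot) + dx,
         x * math.sin(rot) + y * math.cos(rot) + dy,
         (theta + rot_deg) % 360.0)
        for x, y, theta in minutiae
    ]


def estimate_alignment(query, reference, angle_bin=10.0, shift_bin=4.0):
    """Return (rotation_degrees, dx, dy, votes) for the most-voted transform bin."""
    n_angle_bins = round(360.0 / angle_bin)
    votes = Counter()
    for qx, qy, qt in query:
        for rx, ry, rt in reference:
            # The difference in minutia direction suggests the rotation.
            rot_bin = round(((rt - qt) % 360.0) / angle_bin) % n_angle_bins
            moved_x, moved_y, _ = transform([(qx, qy, qt)], rot_bin * angle_bin, 0.0, 0.0)[0]
            # After rotating, the remaining offset suggests the shift.
            key = (rot_bin,
                   round((rx - moved_x) / shift_bin),
                   round((ry - moved_y) / shift_bin))
            votes[key] += 1
    (rot_bin, dx_bin, dy_bin), count = votes.most_common(1)[0]
    return rot_bin * angle_bin, dx_bin * shift_bin, dy_bin * shift_bin, count


# Hypothetical data: the reference is the query rotated 30 degrees and shifted.
QUERY = [(10.0, 10.0, 0.0), (20.0, 15.0, 45.0), (5.0, 30.0, 90.0),
         (25.0, 25.0, 135.0), (15.0, 5.0, 200.0)]
REFERENCE = transform(QUERY, 30.0, 12.0, -8.0)
```

With this synthetic data the five true pairings all vote for the same bin, so `estimate_alignment(QUERY, REFERENCE)` recovers the 30 degree rotation and the (12, -8) shift. Real prints add nonlinear distortion and spurious minutiae, which is why bins have a width rather than demanding exact agreement.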
Indexing Fingerprints
Need to be able to index and retrieve
fingerprints of a given individual
Before digital fingerprints, forensic experts used
filing cabinets to organize prints using a
classification system
Prints are explicitly classified by overall shape:
right loop, left loop, whorl, arch, tented arch, and
double loop
Can be continuously classified using vectors
Section 6.4.6 from The Fingerprint Sourcebook
The Galton Model
First probability model for fingerprint individuality (1892).
Squares of paper of various sizes were dropped over sections of a
fingerprint, and a prediction was made of whether or not the paper
covered minutiae.
Model not based on actual distribution or frequency of
minutiae.
Estimated probability of different pattern types present
and the number of ridges in the selected region of the
print.
Probability of finding any given minutiae in a fingerprint
given as 1 in 68 billion.
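A worked check of the figure above, assuming the factorization usually attributed to Galton: 1/16 for the pattern type, 1/256 for the number of ridges entering and leaving the region, and 1/2 for each of 24 regions. These three factors are not stated on the slide and are included here as an assumption.

```python
from fractions import Fraction

# Assumed factors (not given on the slide).
P_PATTERN_TYPE = Fraction(1, 16)
P_RIDGE_COUNT = Fraction(1, 256)
P_REGION = Fraction(1, 2)
N_REGIONS = 24

# Multiply the factors together, treating them as independent.
galton_probability = P_PATTERN_TYPE * P_RIDGE_COUNT * P_REGION ** N_REGIONS
one_in = 1 / galton_probability  # exactly 2**36
```

Here `one_in` is 2**36, about 68.7 billion, which matches the "1 in 68 billion" quoted on the slide.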
The Osterburg Model
1977-1980
Divide fingerprint into 1 sq. mm sections and count the
occurrence of 13 different minutiae appearances in each
section.
Rarity of a fingerprint arrangement = product of all
individual minutiae frequencies and empty cells.
Example: 72 sq mm fingerprint, 12 ridge endings, each in
one cell, 60 empty cells: probability = (0.766)^60 × (0.0832)^12 =
1.25 × 10^-20, where 0.766 and 0.0832 are Osterburg’s observed
frequencies of an empty cell and a ridge ending.
Problem: This model assumes each cell/section event is
independent.
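The example calculation can be reproduced in a few lines. This is a sketch using only the two frequencies quoted on the slide; the function name is hypothetical, and the product is only meaningful under the independence assumption the slide flags as a problem.

```python
# Osterburg's observed frequencies, as quoted on the slide.
P_EMPTY_CELL = 0.766
P_RIDGE_ENDING = 0.0832


def osterburg_probability(empty_cells, ridge_endings):
    """Probability of an arrangement, treating every 1 sq mm cell as independent."""
    return P_EMPTY_CELL ** empty_cells * P_RIDGE_ENDING ** ridge_endings


# 72 sq mm print: 60 empty cells and 12 cells holding one ridge ending each.
example = osterburg_probability(empty_cells=60, ridge_endings=12)
```

The result should land close to the 1.25 × 10^-20 quoted in the example. Other minutia types would each need their own observed frequency, which the slide does not list.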
The Stoney and
Thornton Model
1985-1989
Determined criteria for an ideal
model to calculate individuality
of a fingerprint and the
probabilistic strength of a
match.
Each minutiae pair is described
by the six characteristics and
the spatial position of the pair
within the entire fingerprint.
Classifying Characteristics
Ridge structure and description of
minutiae locations.
Descriptions of minutia
distribution.
Orientation of minutiae.
Variation in minutiae types.
Variation among prints from the
same source.
Number of orientations and
comparisons.
The Pankanti, Prabhakar,
and Jain Model
2001
Model assesses probabilities of false matches, not
individuality of fingerprints.
Calculates the number of possible arrangements of ridge
endings and bifurcations.
Calculates spatial differences between paired minutiae (x, y, θ)
and accepts sufficiently similar values as matches.
Each fingerprint had four captures, separated in two
databases, to determine an acceptable tolerance of error
based on natural variations.
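The idea of accepting minutiae as corresponding when their (x, y, θ) values agree within a tolerance can be sketched as follows. The tolerance defaults and function names are illustrative placeholders, not the values or code used in the model, and the prints are assumed to be aligned already.

```python
import math


def minutiae_correspond(a, b, dist_tol=15.0, angle_tol=22.5):
    """True if two (x, y, theta_degrees) minutiae agree within tolerance."""
    dist = math.hypot(a[0] - b[0], a[1] - b[1])
    dtheta = abs(a[2] - b[2]) % 360.0
    dtheta = min(dtheta, 360.0 - dtheta)  # wrap-around: 359 vs 1 differ by 2
    return dist <= dist_tol and dtheta <= angle_tol


def count_matches(query, reference, **tolerances):
    """Greedy one-to-one pairing of two already aligned minutiae sets."""
    unused = list(reference)
    matched = 0
    for q in query:
        for r in unused:
            if minutiae_correspond(q, r, **tolerances):
                unused.remove(r)  # each reference minutia is used at most once
                matched += 1
                break
    return matched
```

For example, `count_matches([(10, 10, 0)], [(12, 11, 5)])` counts one match, while a minutia at the same position pointing 90 degrees away is rejected. Wider tolerances absorb more natural variation between captures but also make false matches more likely.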
First Level Detail
Direction of ridge flow in the print.
Not necessarily defined to a specified fingerprint
pattern.
General direction of ridge flow is not unique.
Second Level Detail
Pathway of specific ridges.
Includes starting position, path of the ridge, length,
and where the ridge path stops.
Includes configurations with other ridge paths.
Uniqueness is found with the ridge path, length, and
terminations.
A general direction must exist (first level detail).
Third Level Detail
Shapes of the ridge structures.
Morphology of the ridge: edges, textures, and pore
positions on the ridge.
Shapes, sequences, and configurations of third level
detail are unique.
General direction (first level) and a specific ridge
path (second level) must exist for third level detail.
Persistence
Comparing the visibility of minutiae in fingerprints
over a time span.
Galton found one discrepancy, where a single
bifurcation was not present 13 years later.
Other studies with age spans ranging up to 57 years
found no discrepancies of minutiae.
All in first and second level detail.
Persistence
Pores on the ridges of friction ridge skin remain
unchanged throughout life. Their location remains
the same.
Palm creases (third level detail of the palm) have
seen changes over long time periods.
Due to age of the skin, skin flexibility, and other
factors.
All in third level detail.
Persistence
Basal layer (regenerative layer between dermis and
epidermis).
Friction ridge skin persistency is maintained by the
regenerative cells in the stratum basale, and the
connective relationship of these cells.
Examination Method
Analysis, comparison, evaluation (ACE) and
verification (V)
This is one description of a method of comparing
details, forming a hypothesis about the source,
experimenting to determine whether there is
agreement or disagreement, analyzing the
sufficiency of agreement or disagreement, rendering
an evaluation, and retesting to determine whether
the conclusion can be repeated.
Examination Method
Analysis
The assessment of a print as it appears on the
substrate.
Makes the decision of whether the print is sufficient
for comparison with another print
Looks at the substrate, matrix, development
medium, deposition pressure, and pressure and motion
distortion for their effect on appearance
and distortion
Examination Method
Comparison
Determining whether the details in two prints are in
agreement based upon similarity, sequence, and
spatial relationship occurs in the comparison phase
Because no print is ever perfectly replicated, mental
comparative assessment considers tolerance for
variations in appearance caused by distortion
Comparative measurements of first, second,
and third level details are made along with
comparisons of the sequences and configurations of
ridge paths
Examination Method
Evaluation
The formulation of a conclusion based upon analysis
and comparison of friction ridge skin
The examiner makes the final determination as to
whether a finding of individuation or same source of
origin can be made
Comparative measurements of first, second,
and third level details are made along with
comparisons of the sequences and configurations of
ridge paths
Examination Method
Recurring, Reversing, and
Blending Application of ACE
The examiner can change the phase of the
examination often: re-analyzing, re-comparing, and re-evaluating.
There is no clear linear path through the ACE process
because deciding whether the two
fingerprints are the same complicates things.
Examination Method
Because of the ambiguity of the process the colored
diagram is used to illustrate the process.
The critical application of
ACE is represented in the
model by red area A, green
area C and blue area E
The actual examination is
represented in the model by
three smaller circles with
capital A, C, and E.
Examination Method
The black dot in the center represents the
subconscious processing of detail in which
perception can occur
The gray represents other
expert knowledge, beliefs,
biases, influences and
abilities.
The white that encircles the
gray represents the decision
that has been made
Many evaluations take place.
Eventually the final analysis
and comparison lead to the
final evaluation
Examination Method
Verification
The independent examination by another qualified
examiner resulting in the same conclusion
It is another person going through the ACE process
to verify whether the conclusions about the two prints are the same
The verifier must not know the
previous conclusion, so that the decision is
unbiased
Decision Thresholds
Within each phase of ACE, the examiner must decide
whether to go forward, go backward, or stop the
examination process
History of threshold:
New Scotland Yard adopted a policy (with some
exceptions) of requiring 16 points
The FBI abandoned the practice of requiring a set
number of points
The IAI (International Association for Identification)
formed a committee to determine the minimum
number of friction ridge characteristics which must be
present in two impressions in order to establish
positive identification
Decision Thresholds
The prevailing threshold of sufficiency is the
examiner's determination that sufficient quantity and
quality of detail exists in the prints being compared
Quantitative-qualitative threshold (QQ)
For impressions from volar skin, as the quality of
details in the prints increases, the requirement for
quantity of details decreases; as the
quantity of details increases, the requirement for quality decreases
For clearer prints, fewer details are needed and for
less clear prints, more details are needed
QQ Threshold Curve
One unit of uniqueness in agreement is the
theoretical minimum needed to determine the prints
had been made by the same unique and persistent
source
QQ Threshold Curve
Agreement (white area): sufficient details agree and
support a determination that the prints came from the
same source
Disagreement (white area): sufficient details disagree and
warrant a determination that the prints came from
different sources
Inconclusive (gray and
black areas): the
examiner cannot
determine whether the
details actually agree
or disagree or cannot
determine sufficiency
of sequences and
configurations
APPLICATION OF SPATIAL
STATISTICS TO LATENT
PRINT IDENTIFICATIONS
Methodology
•Ten-print cards - Qualitative image assessment
•Scan, segregate and image enhancement
•Orientation, ULW minutiae detection, mark core and delta
•Geo-referencing and image QC
•GIS data conversion
•Spatial analysis of ridge lines and minutiae
•Statistical analyses and probability modeling
Extraction Software
Free Fingerprint Imaging Software -- fingerprint
pattern classification, minutiae detection, Wavelet
Scalar Quantization (WSQ) compression, ANSI/NIST-ITL
1-2000 reference implementation, baseline and
lossless JPEG, image utilities, math and neural net libs
Universal Latent Workstation (ULW) -- interoperable
and interactive software for latent print examiners.
The software improves the exchange and search of
latent friction ridge images involving various
Automated Fingerprint Identification Systems.
Distribution of Minutiae
Geometric
Morphometric Analysis
Research on fingerprints is traditionally done using
biometrics, which analyzes linear geometric properties but
ignores underlying biological properties
Ignoring these may exclude important biological patterns
Biomathematics includes inherent biological properties of
features
GM is a biomathematical model that includes biometrics,
along with other fields for a comprehensive analysis
GM Analysis
Used for mandibular morphology, craniofacial
features, identification using sinus cavities, pediatric
skeletal age
For this project, GM used to study shape variation of
four pattern types: left and right loops, whorls, and
double loop whorls
GIS used for efficiency
Tasks: Establish Methodology. Begin Analysis.
Method: Landmark and Semilandmark Designation and
Acquisition
30 images of each pattern type georeferenced with ArcGIS to find core and
align in coordinate space
Landmarks – Core, aspects of the delta
Semi-landmarks – Points along a ridgeline
For loops the core was defined as the point along the
innermost ridgeline that forms the first full loop where
the tangential angle is closest to 0 degrees
For whorls and double loop whorls, core defined as ridge
ending in the middle
Method: Landmark and Semilandmark Designation and
Acquisition
Delta defined as a triradius
consisting of 3 ridge systems
converging with each other at an
angle ~ 120 degrees
An equilateral triangle, sized as small
as possible, placed manually to
define the delta. 100% consensus
among team required
Method: Landmark and Semilandmark Designation and
Acquisition
Core and vertices of
triangles defined as
landmarks
For loops:
Radial line template of
seven lines, eighteen
degrees apart.
Intersections of lines and
first continuous ridgeline
are semi-landmarks
Method:
Landmark and
Semi-landmark
Designation and
Acquisition
For loops:
Two reference lines, one vertical, going through core;
one horizontal from lowermost vertex to vertical line
Ten equidistant lines drawn from core to horizontal line
Intersections of the top six lines with the ridgeline that the
core is on are further landmarks
Method: Landmark
and Semi-landmark
Designation and
Acquisition
For whorls:
Line template constructed with thirteen lines, nine
degrees apart
Intersection of lines with first continuous ridgeline were
landmarks
After defining landmarks and semi-landmarks, GIS used
to record the features for all 120 prints
Method: Generalized Procrustes
Analysis
Landmark and semi-landmark coordinates
superimposed into a coordinate system in order to
conduct statistical analysis
Calculated Procrustes mean shape values
Method: Generalized Procrustes
Analysis
RSL and LSL, W and DLW superimposed onto each
other with geometric transformations to determine
variance
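The superimposition step can be sketched for a single pair of 2D landmark configurations: remove position, scale, and rotation, then measure what shape difference is left. This is ordinary (pairwise) Procrustes fitting written with complex numbers, offered as a simplified stand-in; a generalized analysis repeats the fit against an iteratively updated mean shape. The function names are hypothetical, and the project itself used R.

```python
import math


def normalize(shape):
    """Center a landmark configuration and scale it to unit centroid size."""
    pts = [complex(x, y) for x, y in shape]
    centroid = sum(pts) / len(pts)
    centered = [p - centroid for p in pts]
    size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    return [p / size for p in centered]


def procrustes_distance(shape_a, shape_b):
    """Superimpose shape_b on shape_a and return the residual shape difference."""
    a, b = normalize(shape_a), normalize(shape_b)
    # Least-squares rotation for 2D landmarks, expressed as a unit complex number.
    cross = sum(pa * pb.conjugate() for pa, pb in zip(a, b))
    rotation = cross / abs(cross)
    fitted = [pb * rotation for pb in b]
    return math.sqrt(sum(abs(pa - pf) ** 2 for pa, pf in zip(a, fitted)))
```

Two configurations that differ only by translation, rotation, and size give a distance of essentially zero, so whatever remains reflects shape variation. Reflection is not removed, which matters when comparing left and right loops.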
Method: Thin-Plate
Spline
Procrustes mean shape values analyzed using R
statistical software to produce TPS deformation
grids
Method: Thin-Plate
Spline
TPS grids provide a smooth interpolation of inter-landmark space and provide exact mapping for
landmarks and semi-landmarks from one pattern
type onto another
Method: Principal
Component Analysis
Captured a percentage of total variation based on
distribution to summarize original larger data set
Direction of relative displacement for each landmark
determined
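How a single component can capture a share of the total variation is easiest to see in a small example. The sketch below finds the first principal component by power iteration on the covariance matrix. It assumes each row holds the flattened landmark coordinates of one print; the function name is hypothetical and a statistics package would normally do this.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, share_of_total_variance) for the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Uneven starting vector, so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, compared with the total variance (trace of cov).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total
```

The returned direction plays the role of the displacement vector for the landmarks, and the second value is the percentage of total variation that the component summarizes.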
Results: Generalized
Procrustes Analysis
LSL: semi-landmarks were tightly clustered around mean
shape showing little shape variation for both core
ridgeline and continuous ridgeline. Large dispersion of
delta landmarks and crease landmark
Whorls: Continuous ridgeline showed little shape
variation. Delta and crease landmarks showed significant
variation
LSL-RSL: greater dispersion due to size variation and
rotational effects
W-DLW: same as LSL-RSL
Results: Thin-Plate
Spline
The greater the deformation in the grid, the more
shape variation between the two
RSL-LSL: high degree of shape consistency with
greatest variation in the delta region
W-DLW: same as RSL-LSL
Results: Principal
component analysis
Calculations used to reduce total of landmarks and semi-landmarks to one set to summarize degree of shape
variation in each pattern type
Direction of variation represented by vector line
Degree of variation indicated by amount of deformation
in grid
RSL-LSL: different directions of variation, greatest
variation in delta regions
W-DLW: greatest variation in delta regions
False-Match Probabilities
and Monte Carlo Analyses
“A computer algorithm used to repeatedly resample
data from a given population to make inferences
about stochastic processes”
Ideal for rare events, which are hard to analyze with
other methods
Goal is to produce an expected result, E(X) where X is
a random variable. MC sim creates n independent
samples of X, and as n increases, the average of the
samples converges to the expected result
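The convergence described above can be seen with a toy random variable. This sketch averages n independent rolls of a fair die, whose expected value is 3.5; the die, the function names, and the fixed seed are illustrative choices and have nothing to do with the fingerprint data.

```python
import random


def monte_carlo_mean(draw, n, seed=0):
    """Average n independent samples produced by draw(rng)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(draw(rng) for _ in range(n)) / n


def die_roll(rng):
    """One sample of X: a fair six-sided die, so E(X) = 3.5."""
    return rng.randint(1, 6)


estimate = monte_carlo_mean(die_roll, 100_000)
```

With more samples the estimate tends to tighten around 3.5, which is the same mechanism used to estimate a false-match probability when no closed-form answer is available.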
False-Match Probabilities
and Monte Carlo Analyses
Used for village placement to avoid natural disasters,
species diversity, evolution, air traffic control
For this project: There is biological ground to believe
that fingerprints are unique, but statistics allows for
duplicates
Uniqueness not in question, but partial uniqueness is
possible. Since examined prints are rarely full, need
to see chances of partial duplicates
False-Match Probabilities
and Monte Carlo Analyses
Methods and background are numerically and
theoretically intensive, so will email paper to those
more interested
No assumptions – works well for small sample sizes,
but assumptions must be used for larger numbers
Compared different sample sets to determine
probability of a false-match
1200 fingerprints
False-Match Probabilities
and Monte Carlo Analyses
GIS
Standardize coordinate space and analyze print by
section
Eight simulations to determine how each attribute
affects false-match probabilities
Nine overlapping grid cells and total minutiae in each
cell counted
Sets of three, five, seven, or nine minutiae selected
False-Match Probabilities
and Monte Carlo Analyses
Minutiae selected without replacement
50 prints selected for LSL, RSL, W, DLW
20 prints selected for arches, 25 for tented arches
Simulations iterated 1000 times
Comparisons across and within pattern types
Needed to account for variance of each minutia
Bifurcation angles, ridge ending roundness, etc.
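The sampling loop can be sketched in miniature: repeatedly draw k minutiae without replacement from one print and check whether a second print has a minutia near every one of them. This is a simplified illustration using location only; the function name, the tolerance, and the toy coordinates in the usage note are assumptions, and the study's actual simulations also weighed minutia type, angle, and other attributes.

```python
import random


def false_match_rate(print_a, print_b, k, iterations=1000, tol=1.0, seed=0):
    """Fraction of random k-minutiae subsets of print_a also found in print_b.

    Prints are lists of (x, y) minutia locations in a shared coordinate space.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        subset = rng.sample(print_a, k)  # selection without replacement
        # A hit requires every sampled minutia to have a nearby partner.
        if all(any(abs(x - u) <= tol and abs(y - v) <= tol for u, v in print_b)
               for x, y in subset):
            hits += 1
    return hits / iterations
```

For instance, `false_match_rate([(0, 0), (10, 0)], [(0, 0), (10, 0)], k=2)` returns 1.0 because every subset is found, and moving the second print far away drops it to 0.0. Raising k makes a hit harder to obtain, which mirrors the drop in false-match probability as more minutiae are considered.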
MC Results
Similar probability results for all pattern types
As robustness of simulation expanded, probability of false
match decreased greatly
Using all criteria with location, three minutiae have a false-match chance of 1 in 5 million
Using only location, chance is 1 in 1,600
Using only location with 5 minutiae, chance is 1 in 125,000
Only one false match found when considering position of
9 minutiae
MC Results
Highest false match probability in regions below core and near
delta (more minutiae)
Regions above core have very low false match probability (fewer
minutiae)
Most matches found using Monte Carlo are obviously not
matches when examined
Similar patterns of minutiae, but not of minutia type, were found
Small sample size limits conclusions
100,000 fingerprints considered desirable for strong results (6-7
weeks of computer time)