
Summary

Given the possible number of genetic variations, the probability of having a naturally occurring Doppelganger is low. This is why DNA evidence acquired at crime scenes is treated as such conclusive evidence when presented in criminal trials. Though the process of DNA fingerprinting is fallible, the probability that two unrelated people with the same DNA exist is microscopic. Unless you have an identical evil twin, then, the probability that you will be mistaken for a criminal based on such evidence is low. Fingerprints, however, being only a portion of this genetic identity, seem far less restrictive. It is therefore conceivable that one could be mistaken for the perpetrator of a crime based on fingerprint evidence. It is our goal to determine exactly how probable this is.

One of the progenitors of the study of fingerprint identity was Sir Francis Galton, who identified characteristic ridge patterns in the skin that vary widely among a population, but which are constant over time for an individual. In addition to these minutiae, fingerprints also have an overall pattern that in nearly all cases falls into one of three groups: loops, arches, and whorls. Using both the overall fingerprint patterns and a set of the most commonly occurring Galton Characteristics (GCs), we created a model to test the individuality of fingerprints, based on a probabilistic interpretation: highly probable fingerprints are less individual, and less probable fingerprints are more individual. In this model, we first divided an ideal rectangular thumbprint into squares of equal area, denoted as cells. Knowing that any comparison between two fingerprints first matches the general pattern of a fingerprint and then a certain number of GCs, we calculated the fingerprint patterns that have the maximum probability of occurrence. This was done using published figures for the relative frequency of occurrence of each of the patterns and GCs.
To start, we assumed that from an ideal thumbprint containing N total cells, we chose to confirm the form and placement of n GCs in those cells. Our model proceeds in stages, first choosing the overall pattern of the print, and then proceeding to choose n locations of GCs from the N total placements possible. Once the pattern and placement have been determined, it remains only to factor in the relative occurrence probabilities of each GC in order to determine a measure of the individuality of the fingerprint. The model is constructed based on a number of assumptions. To begin with, we first assume that the patterns and GCs occur independently; neither has an influence on the other’s probability. In later stages of our analysis, then, we account for the fact that dependencies may exist, and alter the selection of GCs accordingly. Another assumption that our model makes is that the GCs occur independently; that is, in the n spaces which we wish to confirm the presence of GCs, placement has no effect on which characteristic is selected. Since there has been no conclusive evidence that a particular fingerprint pattern has any influence on the minutiae present in the fingerprint, this seems to be a valid assumption, and hence no unnecessary restrictions were placed on the form of the fingerprint. The construction of the model allowed us to calculate the ability to confirm a fingerprint based on partial fingerprint evidence. In addition, we used population figures of many countries and the entire world to find what the minimum number of GCs in common between fingerprints should be before a match can be said to occur. In testing this model, we did not calculate the probability of occurrence for every individual pattern and placement of GCs. Rather, we calculated only the probability of the most likely occurrence. Also, the orientation of GCs was not taken into consideration. 
This may at first seem to be a weakness, but is in fact a strength, as requiring a fingerprint to occur with GCs oriented in a particular direction is stricter than not requiring any particular direction for their placement. Thus, any fingerprint occurring in nature is hypothetically less likely to occur than our calculated maximum. For a template fingerprint with 12 identified minutiae, a reasonable required number given new advancements in laser recognition of fingerprints, the probability of finding a match was calculated to be on the order of $10^{-13}$. This figure shows that even the most likely fingerprint is highly individual, and that fingerprint identification is as reliable on ideal grounds as DNA identification, which has reliability on the order of $10^{-10}$.

Team 250

AN INQUIRY INTO INDIVIDUALITY OF THUMBPRINTS

Asma Al-Rawi, Steve Gilberston, Jonathan Whitmer
Kansas State University
Mathematical Contest in Modeling 2004

I. Introduction

“How can you disbelieve in me when I have created each one of you down to the prints on your fingers?” --God (The Holy Qur’an 75:3-4) [4]

The above reference, depending on one’s religiousness or secularism, either confirms that fingerprints are distinct to individuals, or at the very least shows that knowledge of the variation of fingerprints between persons, and of its usefulness for identification, has existed since the 7th century. In modern Western culture, the idea of using fingerprints as a means of identification first appeared in an article written by Henry Faulds in 1880 in the journal Nature [3]. His interest was aroused by his discovery of ridged pattern imprints in handmade pottery. After performing a series of experiments to determine the differences in fingerprints among individuals as well as their resilience, he recommended that these ridged imprints be used as evidence of criminal identity at the scene of a crime.
At the root of this assertion is the assumption of uniqueness in each human’s fingerprint patterns. There are several commonalities in the patterns of ridged skin, however, which allow fingerprints to be systematically classified. For example, the ridged lines on fingers appear in a number of major pattern types: loops, which comprise the largest portion of all fingerprints and occur in two chiralities; whorls, which are characterized by the spiraling pattern of the ridges; and arches, which comprise the smallest major group [1]. Other manifestations exist, but they occur very rarely. In addition to these major groups, the ridges of different fingerprints show certain defining characteristics. This idea was central to one of the first attempted quantifications of fingerprint individuality, which was performed by Sir Francis Galton in 1892 [1]. The patterns of finger ridge divergences and combinations, termed minutiae, are also called Galton Characteristics in his honor. Later developments have incorporated his ideas along with other print-determining factors to establish more exactly each print’s uniqueness [1,2,6]. Whether or not each fingerprint pattern is truly unique, fingerprints have been widely used as a form of identification in forensic science. Recently, however, the validity of fingerprint evidence has been called into question, as evidenced by the case United States v. Mitchell, which presented the US with its first challenge to the admissibility of latent fingerprint evidence as a means of identification [7]. This necessitates a reevaluation of how unique fingerprints are at the level at which they can be measured. Thus, we are faced with the problem of determining the probability that two people in the world might share the same fingerprints to measurable accuracy.
This is quite a complex problem if one allows it to be, as there seem at first to be almost infinitely many variations within ridge patterns whose appearance and interplay must be accounted for, and yet it has a simple and elegant solution which we will show in this paper. In our study, we focus not on each of the ten fingers, but on only the thumb, which effectively serves as an upper bound for the multiple occurrence probability of all friction ridged skin. Our calculations have found, on the basis of a discrete probability model, that it is extremely unlikely that two people with the same thumbprints have ever existed, within the limitations of current measurement practices.

II. Model

The first step in devising a model for thumbprint individuality is simply to understand what types of fingerprints exist. As mentioned previously, fingerprints occur in what seems to be an infinite number of variations, determined by both their overall pattern and the distribution of Galton Characteristics (GCs). The patterns fall into three main categories: loops, arches, and whorls. These can be further divided into over a thousand subcategories [1]. Figure 1 shows the major types of prints.

FIGURE 1. The four most common fingerprint patterns: left and right loops, whorls, and arches. From www.sfis.ca.gov/pattern_types.htm.

Prints which fall into these categories can, to the untrained eye, and oftentimes even the trained eye, appear very similar. When the contribution of GCs is factored in, a particular fingerprint’s unique character starts to become apparent. The major types of GCs are illustrated in Figure 2. Whether the pattern on the finger is a loop, arch, or whorl, GCs occur randomly throughout the entire print. These occurrences give distinct attributes to the print that can be systematically classified.

FIGURE 2. A chart showing the 10 most common forms of Galton Characteristics. (Osterburg ??)
The central problem, given a known classification of a fingerprint by its pattern and GCs, is to calculate the probability that an identical fingerprint exists. Our model focuses specifically on thumbprints, for a variety of reasons. For instance, a thumb is easy to idealize. In practice, when fingerprints are taken, the finger is rolled over nearly its entire surface above the first knuckle. This is similar to the unrolling of an uncapped cylinder, so the shape of the print on paper is approximately rectangular. The thumbprint also has the largest area, and hence the largest number of defining qualities, due to the random distribution of GCs. For an ideal rectangular thumbprint, we partition the area into N equally sized squares, with a minimum size on the order of one square millimeter, which is the smallest extent within which a GC can be identified as occurring in one of the N squares. Since only a finite number of visible GCs can occur on a single patterned finger, a discrete probability method is useful for determining the possibility of Doppelganger thumbs. It is then perfectly admissible to use a counting argument to find approximately the number of possible arrangements of friction ridges on the thumb, and their relative occurrences based on the features they contain.

It should be noted that ideal fingerprints as described above do not usually occur in actual fieldwork. Usually only portions of fingerprints are left by oils or other substances on the fingers of the criminal; these are called latent prints. After these latent prints are developed and brought into visible form, they are described as partial prints. These partial prints contain only a fraction of the total surface of the friction ridged skin on the thumb. Using ideas similar to the ones above, we can model partial prints simply by decreasing N; that is, by limiting the number of cells on which the prints have to match up.
Since a partial print cannot possibly match the rest of the cells contained in an ideal print, the characteristics of those cells are irrelevant. Decreasing N then gives an accurate model, as we can say that the area we are sampling from is smaller. Accordingly, the probability of matching the print among people of a given population grows, as we show below.

III. Probability Algorithms

Our first step was to measure the dimensions of an idealized thumb. Averaging over the three members in our group, we found the dimensions of a nearly rectangular print, when measured as described above, to be approximately 3 cm by 4 cm. Thus there are approximately 1200 square millimeters on such a print. We took each square millimeter to be a cell, so that in our ideal thumb model, a full print offers 1200 possible identification points. In practice, a suspect’s thumbprint and the thumbprint found at the scene of the crime are compared to each other on both the overall pattern and a certain number of distinguishing characteristics. The distinguishing factors can correspond to either scars on the suspect’s thumbprint or GCs. Since scars are the result of completely random events, and thus are nearly impossible to quantify without exact personal histories, our model considers only the cases in which GCs occupy these identifying points.

In previous models [1,2], the relation between GCs and the overall pattern was not considered; only the occurrence of GCs was taken into account. In our model, various degrees of pattern and GC independence were considered. This accounts for the possibility that a certain percentage of the GCs are inherent in the overall pattern. In the case where pattern and GC occurrences are completely independent, one can separate the probability of a fingerprint’s occurrence into two factors:

$$P_{fp} = P_p \, P_{GC} \qquad (1)$$
In the above equation, $P_{fp}$ is the probability that a particular fingerprint will occur, $P_p$ is the probability that a particular pattern will occur, some approximate figures for which are given in Table 1, and $P_{GC}$ is the probability of a particular combination of GCs.

Class of Print    Probability
Right Loop        0.325
Left Loop         0.325
Whorl             0.3
Arch              0.05
Total             1

TABLE 1: A list of approximate occurrence probabilities of the four most common thumbprint patterns, from Osterburg et al. The loop category is determined therein to have a 65% occurrence probability, which here is divided between the two chiralities, which are easily distinguishable and occur at nearly the same rate overall.

Our model treats non-measured GCs and cells in which there are no GCs as equivalent empty cells. Thus, in the case where GCs are dependent on which pattern a fingerprint has, we can still use this independence model: since a particular percentage of the GCs are determined by the pattern, we can treat those as empty space in which no defining characteristic occurs.

Suppose, then, that we wish to find the probability that a particular distribution of measured GCs occurs. To do this, we note that of the N total cells in the fingerprint, only n have any significance in terms of GC measurement. The number of ways these can be distributed is easy to compute. Placing all measured cells on the same level, we begin placing GCs and empty cells on the surface of the thumbprint. At first there are n GCs to place within the total area of the print, and N total cells to place them in. If the first cell is empty space, we are left with N-1 cells in which to place characteristics, and n characteristics. If the first cell contains a characteristic, we are left with N-1 cells in which to place characteristics, and n-1 GCs. Iterating this choice process over all N cells, we find that the number of ways we can place the GCs is

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!} \qquad (2)$$
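The cell-by-cell choice described above can be checked numerically. The following is a minimal sketch, not part of the original spreadsheet work: the function names are hypothetical, and it simply confirms that iterating the "empty cell or GC cell" choice over all N cells reproduces the closed form of equation (2).

```python
from math import comb


def placements_closed_form(total_cells: int, measured_gcs: int) -> int:
    """Number of ways to choose the n GC cells among N cells, equation (2)."""
    return comb(total_cells, measured_gcs)


def placements_by_cell_choice(total_cells: int, measured_gcs: int) -> int:
    """Count placements by deciding, cell by cell, whether the cell is
    empty or holds one of the measured GCs (the iteration in the text)."""
    # ways[k] = number of ways to have placed k GCs in the cells seen so far
    ways = [1] + [0] * measured_gcs
    for _ in range(total_cells):
        # Walk k downward so ways[k - 1] still refers to the previous cell count.
        for k in range(measured_gcs, 0, -1):
            # Either the new cell is empty (ways[k] unchanged)
            # or it holds a GC (adds the ways to place k - 1 GCs before it).
            ways[k] += ways[k - 1]
    return ways[measured_gcs]


if __name__ == "__main__":
    N = 1200  # cells on the ideal thumbprint (Section III)
    for n in (1, 2, 12):
        assert placements_by_cell_choice(N, n) == placements_closed_form(N, n)
        print(n, placements_closed_form(N, n))
```

Python integers are unbounded, so the exact count is available directly; the logarithmic approximations introduced later are only needed in tools with fixed-precision arithmetic.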
This leaves us to calculate the probability that each GC cell contains a particular GC. Osterburg et al. give relative frequencies of occurrence for each characteristic, averaged over 39 fingers. Table 2 gives these figures. In our model, since we disregard empty spaces, we considered only the relative frequency of the eleven most common elements. Double occurrences, or the event that two GCs occur in the same space, while certainly possible, were ignored in this model calculation, due to their small frequency. The multiple-occurrence figure in the table is misleading, as it accounts for all double occurrences, not double occurrences of particular types.

Parameter  Cell configuration     Frequency  Probability of Parameter
0          Empty                  6,584      0.766
1          Island                 152        0.018
2          Bridge                 105        0.012
3          Spur                   64         0.007
4          Dot                    130        0.015
5          Ending ridge           715        0.083
6          Fork                   328        0.038
7          Lake                   55         0.006
8          Trifurcation           5          0.001
9          Double bifurcation     12         0.001
10         Delta                  17         0.002
11         Broken ridge           119        0.014
12         Multiple occurrences   305        0.036
           Total                  8,591      1.000

TABLE 2. Experimentally determined Galton Characteristic probability numbers. From Osterburg et al. Our model disregards multiple occurrences; hence, for our purposes, the characteristics numbered 0 and 12 are empty cells. Only the characteristics numbered 1-11 are relevant.

The relative probability is a necessary factor for determining which characteristic is most likely to occur in the n GC cells. The relative probability of the ith characteristic is given by

$$r_i = \frac{P(i)}{\sum_{i} P(i)} \qquad (3)$$

where the elements P(i) are determined from Table 2. The index i in this case ranges from 1 to 11, as our model considers only single GC occurrences, and treats the low probability and multiple occurrence GCs as empty space. It should be noted that their inclusion would decrease the relative probability of the ith term as defined above; hence, it would decrease the upper bound which our calculation aims to set.
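As a small illustration of equation (3), the relative probabilities can be computed directly from the Table 2 frequency counts for characteristics 1 through 11. This is a sketch only; the dictionary and function names are hypothetical, and working from raw counts rather than the rounded probability column is a choice made here to avoid rounding error.

```python
# Frequency counts for the single-GC characteristics (parameters 1-11 in Table 2).
GC_COUNTS = {
    "Island": 152,
    "Bridge": 105,
    "Spur": 64,
    "Dot": 130,
    "Ending ridge": 715,
    "Fork": 328,
    "Lake": 55,
    "Trifurcation": 5,
    "Double bifurcation": 12,
    "Delta": 17,
    "Broken ridge": 119,
}


def relative_probabilities(counts: dict) -> dict:
    """Equation (3): each count divided by the sum over the considered GCs.

    Dividing counts is equivalent to dividing the probabilities P(i),
    because the common factor 1/8591 cancels.
    """
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    r = relative_probabilities(GC_COUNTS)
    most_likely = max(r, key=r.get)
    print(most_likely, round(r[most_likely], 4))
    print("sum of r_i:", sum(r.values()))
```

The most probable characteristic found this way is the ending ridge, which is the one used for the maximum-probability template.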
Clearly, the sum of these relative probability quantities is 1; hence they are validly defined as probabilities. For n GCs, the probability of each arrangement is given by the relative probability of each GC raised to the power of the number of times that GC is selected, divided by the number of ways to divide those n elements into groups categorized by the eleven GCs considered. Though the idea is complex, the notation is mathematically rather simple: it corresponds to the product of the selection probabilities divided by the multinomial coefficient for choosing, out of n, $n_1$ of GC number 1, $n_2$ of GC number 2, and so on. If we further divide this quantity by the number of ways the n GC cells can be placed among the N cells, given by equation (2), we obtain the probability of each arrangement of n GCs, shown in equation (4a).

$$P_{GC} = \frac{\prod_{i=1}^{11} r_i^{n_i}}{\binom{n}{n_1 \cdots n_{11}} \binom{N}{n}} = \frac{\prod_{i=1}^{11} r_i^{n_i}}{\dfrac{n!}{n_1! \cdots n_{11}!} \cdot \dfrac{N!}{n!\,(N-n)!}} = \frac{\prod_{i=1}^{11} (n_i!)\, r_i^{n_i}}{\dfrac{N!}{(N-n)!}} \qquad (4a)$$

One should note that in the above,

$$\sum_{i} n_i = n \qquad (4b)$$

hence there are only as many stages considered in the determination of GCs as there are GCs that are measured and available for comparison.

To reiterate, our algorithm for calculating Doppelganger thumb probabilities considers separately the probabilities of the general pattern and of GC occurrence. The probability of GC occurrence is determined by the number of places in which GCs are observed, the relative probability of a GC occurring there, and the number of ways these GCs can then be ordered. The quantification of this is given by equation (4a). Now, given equations (1) and (4a), we can calculate the probability of any particular fingerprint matching on both the pattern and any n GCs by using the information in Tables 1 and 2. Since we wish to put a limit on the number of people in the world who can match fingerprints, given these characteristics, we calculated $P_{max}$, the probability of any thumbprint matching a template with only the most likely characteristic in each of the GC places.
This simplifies equation (4a), by restricting choice to only the GC with maximum probability. Thus we have

$$P_{GC} = \frac{\prod_{i=1}^{11} r_i^{n_i}}{\binom{n}{n_1 \cdots n_{11}} \binom{N}{n}} \le \frac{r_{max}^{\,n}}{\binom{n}{n,\,0,\,\ldots,\,0} \binom{N}{n}} = \frac{r_{max}^{\,n}}{\binom{N}{n}} = P_{max} \qquad (5)$$

Some plots of this are given in Appendix A. These plots use the value of $r_{max}$ obtained by computing the relative probability of ending ridges, and consider only the right and left loop patterns (occurring in equal supply) to constitute the maximum pattern probability.

To calculate the quantities in equation (5), it becomes necessary to calculate factorials of very large numbers to determine values of N choose n. This can be done approximately by using Stirling’s approximation, whose formula is given by

$$\log(m!) \approx m \log(m) - m + \tfrac{1}{2} \log(2 \pi m) \qquad (6)$$

This, in turn, leads us to the approximation

$$\log \binom{N}{n} = \log(N!) - \log(n!) - \log((N-n)!) \qquad (7)$$

which can be utilized to approximate $\binom{N}{n}$.

If we suppose that a percentage of GCs are dependent on the overlying pattern, then our model changes very little. Assuming that l of the n total GCs are dependent on a particular pattern, we can essentially treat all pattern-dependent GCs as empty cells, as they would be exactly what is expected in the print at that point in the pattern. Hence, with a slight modification from n to n – l, equations (4a) and (4b) can still be utilized. In the event that the GCs are wholly determined by the overlying pattern, we can disregard the influence of the pattern in our calculation of $P_{fp}$, as we have more precise information about GC form and occurrence than we do about pattern and sub-pattern form and occurrence. Also, our estimates for the likelihood of a GC occurring at a given point in the N-square array give a more limiting maximum for the probability than do our figures on general pattern characteristics.
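Equations (5) through (7) can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the Excel calculation used for the paper: the function names are hypothetical, the exact route uses the log-gamma function in place of Stirling's formula for comparison, and the `pattern_dependent` argument implements one reading of the "n to n – l" modification described above. The pattern factor of equation (1) is deliberately left out, so the result is the GC bound of equation (5) only, and the values it produces are not guaranteed to reproduce the spreadsheet figures quoted in Section IV.

```python
from math import lgamma, log, pi

# Relative probability of the most common GC (ending ridge), from the
# Table 2 counts: 715 ending ridges out of 1,702 single-GC cells.
R_MAX = 715 / 1702


def log_factorial_stirling(m: int) -> float:
    """Stirling's approximation to log(m!), as in equation (6)."""
    if m == 0:
        return 0.0  # log(0!) = 0; the formula itself is undefined at m = 0
    return m * log(m) - m + 0.5 * log(2 * pi * m)


def log_factorial_exact(m: int) -> float:
    """log(m!) through the log-gamma function, used here as a reference."""
    return lgamma(m + 1)


def log_binomial(total_cells: int, measured: int,
                 log_factorial=log_factorial_exact) -> float:
    """log C(N, n) assembled from log-factorials, as in equation (7)."""
    return (log_factorial(total_cells)
            - log_factorial(measured)
            - log_factorial(total_cells - measured))


def log10_p_max(total_cells: int, measured: int, pattern_dependent: int = 0,
                r_max: float = R_MAX,
                log_factorial=log_factorial_exact) -> float:
    """Base-10 logarithm of r_max**n / C(N, n), equation (5).

    ASSUMPTION: pattern-dependent GCs are handled by replacing n with n - l,
    which is one reading of the modification described in the text.
    """
    effective = measured - pattern_dependent
    natural_log = (effective * log(r_max)
                   - log_binomial(total_cells, effective, log_factorial))
    return natural_log / log(10)


if __name__ == "__main__":
    for n in (5, 10, 12, 20):
        exact = log10_p_max(1200, n)
        approx = log10_p_max(1200, n, log_factorial=log_factorial_stirling)
        print(n, round(exact, 3), round(approx, 3))
```

Working in logarithms keeps every intermediate quantity within ordinary floating-point range, which is the practical reason for equations (6) and (7).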
The omission of the pattern influence on the fingerprint probability is completely valid, since total GC dependence on pattern is equivalent to total pattern dependence on GC; they simply become two different types of taxonomy.

IV. Data

Returning to the problem now, we are specifically asked to determine the probability that a person can be misidentified by fingerprint evidence; that is, we are to determine the probability that two people share the same fingerprint characteristics. For a template with n GCs, we are to calculate the probability that two distinct people match the template. This is limited by the square of $P_{max}$ for a given n, which, as graphed in Figure 3 below, is seen to be very low for all n ≥ 10. For the value n = 12, taken in Osterburg et al. to be a median value for what is required for verification by various international law enforcement agencies, we can see that the probability of fingerprint multiplicity is $4.64 \times 10^{-15}$. These calculations were performed using a Microsoft Excel spreadsheet and the formulas in Section III.

[Plot: Maximum Probabilities at Various Pattern Dependencies. Horizontal axis: number of GCs (0 to 35). Vertical axis: P_max, logarithmic scale. Series: no dependence, 25% dependence, 50% dependence, 75% dependence, 100% dependence.]

FIGURE 3: Plot of maximum probability as a function of the number n of GCs used in the verification process. Here n is allowed to range from 1 to 30.

Another directly applicable and highly interesting problem is the following: what is the minimum number of GCs that a particular country’s law enforcement agencies must use in order to get the highest probability of a unique match using the lowest number of GCs per identification? Using the population figures in Table 3, we can determine this. To do so, we multiply the population of a country by $P_{max}$ to find the number of people in a country that are probable to match a given n GC template.
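The population calculation just described amounts to a single multiplication followed by a search over n. The sketch below is illustrative only: the function names are hypothetical, the threshold of one expected person is an assumption made here for the example, and the demonstration uses a toy probability function that is not the paper's model. Any function returning the match probability for a given n, such as an implementation of equation (5), can be passed in its place.

```python
from typing import Callable, Optional

# Population figures as listed in Table 3.
POPULATIONS = {
    "US": 2.925e8,
    "World": 6.347e9,
    "China": 1.295e9,
    "Liechtenstein": 3.284e4,
    "People ever": 1.269e10,
}


def expected_matches(population: float, p_match: float) -> float:
    """Expected number of people matching a template: population times P_max."""
    return population * p_match


def smallest_sufficient_n(population: float,
                          p_match_for_n: Callable[[int], float],
                          threshold: float = 1.0,
                          max_n: int = 30) -> Optional[int]:
    """Smallest n for which the expected number of matches drops below
    `threshold` (ASSUMPTION: 1.0 person by default), or None if no n up to
    max_n achieves it."""
    for n in range(1, max_n + 1):
        if expected_matches(population, p_match_for_n(n)) < threshold:
            return n
    return None


if __name__ == "__main__":
    # Toy stand-in for the match probability, for demonstration only.
    def toy_probability(n: int) -> float:
        return 10.0 ** (-n)

    for name, population in POPULATIONS.items():
        print(name, smallest_sufficient_n(population, toy_probability))
```

Separating the population step from the probability model makes it easy to repeat the search for the full print and for the reduced values of N used for partial prints.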
The results are plotted in Appendix A. The plots in Appendix A all point to near certain identification for n ≥ 12. This is true regardless of the country in which the identification is being made. In fact, using the world population figure, on a thumb with 1200 cells a unique match is all but certain; indeed, only one person is likely to have ever existed with such a print.

Country          Number of people
US               2.925E+08
World            6.347E+09
China            1.295E+09
Liechtenstein    3.284E+04
# People Ever    1.269E+10

Table 3: Population figures for the world and some representative countries. The number of people ever was computed on the assumption that roughly twice as many people have existed in the history of humanity as exist at this particular point in time.

As was noted before, however, it might be the case that a thumb with 1200 cells is overly large, or that only partial prints can be obtained for identification purposes. In this case, we restrict N to a number less than 1200. For the plots in Appendix B, we changed the number 1200 in our calculation to values of N = 600 and N = 300. This increases the probability of finding multiple matches, because there are fewer sites in which to place the n GCs. However, if as few as 12 GCs are matched, the fingerprint’s unique identity is all but assured.

V. Error Analysis

A previous investigation by Pankanti et al. included the orientation of each minutia in the model for fingerprint individuality. We chose not to include the orientation of the characteristic for several reasons. Firstly, including the factor of GC orientation could only decrease our estimate of the maximum possible thumb Doppelganger probability. Since we are attempting only to find a maximum bound for this probability, removing a factor which can only decrease the probability of a particular print, and which would unnecessarily complicate our solution, does no damage to our model.
Pankanti, who accounts for orientation in his model, arrived at a lower figure for fingerprint individuality than we did. In accounting for this orientation, however, Pankanti completely disregards the differences in minutiae, concentrating only on the location and orientation of defining features in the fingerprint ridges. Some figures from various model calculations that are included in Pankanti’s paper are listed in Table 4, in Appendix C.

A second reason our model disregards orientation is that it relies on the assumption that minutiae occur either independently or semi-independently. In accounting for orientation, we would have to take into account restrictions placed on the orientation of the GC by the overall pattern. This is simple to see: persons with loop patterns have a higher probability of upward and downward pointing GCs than do persons with arches. Accounting for orientation would make the pattern and minutiae probabilities inseparable, and again harm the simplicity of our model while offering little improvement to our limiting maximum.

Another unavoidable problem with our model is the roughness of the pattern and GC frequencies. Unfortunately, we found no good published assessments of the percentages of the population whose patterns fall into the arch, loop, and whorl categories. The frequency of occurrence of GCs faces a similar problem. In fact, the only figures we could find were rough estimations based on a small sample of people. Osterburg, whose figures we used in this model, arrived at his probability parameters of GCs by sampling from 39 fingerprints. He did break them into a total of 8,591 cells, but as we do not know whether or not a single person is more likely to have a certain type of GC, these probabilities cannot be taken at face value [1]. Surely more recent figures on these parameters exist, but their absence again does not harm our model, only the figures which it calculates.
As mentioned before, there is a possibility that there exists dependence between GCs and the overall pattern of a fingerprint. In our model, we attempted to account for this by decreasing the identifying traits of a particular minutia by 25%, 50%, and 100%. For the 100% case, we simply calculated the probability of a particular GC occurrence and disregarded the pattern, as either can be seen to be the determining factor of the other. This is not an exact model, simply because it assumes semi-independence where complete dependence may occur. Without proper relations that give the dependence of minutiae on the overall pattern, however, we are unable to properly account for this. Inasmuch as we were able to adjust for these parameters, our model still predicts that identifying 12 or more minutiae on a print, which is well within current technology, all but assures a positive match.

One who pays close attention to our graph in Figure 3 will note that the graphs of 100% and 0% dependence are actually the closest in predicted probability. This is because removal of the pattern parameter in the calculation of $P_{max}$ only increases the overall maximum probability by an approximate factor of 10. The other figures suffer from inexactness in relating the dependence between occurrence of pattern and minutiae. In the figures for our model, we have more precise knowledge of GC occurrence than of pattern occurrence. Hence, the plots in which we require a percent dependence on pattern suffer unnecessarily from inexact data.

As we are creating a somewhat idealistic model of fingerprints, scars were not taken into consideration. As can be seen in Figure 4, scars do have an effect on the appearance of fingerprints. This may create inaccuracies; however, there is no good way to model the formation of scars, as this is completely due to personal experiences.

FIGURE 4: The effect of scars on fingerprint analysis. From Cowger, p. 4.
Our model also differs on one account from most other models of fingerprints. Previous articles [3] published on fingerprint analysis define fingerprints only as the portion in the general vicinity of the central pattern. Our model instead takes the print on the entire area above the upper joint of the thumb, which would be the type of fingerprint on file. Accordingly, our probabilities are significantly lower than those calculated by others. However, our model can, as mentioned before, be made to approximate these in the limit where the number of cells N is around 300 and n is around 12. The values we calculated by this method match up to other models accordingly, as seen in Table 4.

The major problem from which our model suffers is its inability to account for human error in determining thumbprint probability. Epstein [7] notes that the major problem with latent fingerprint evidence is the inability of the humans who examine the prints to discern exact characteristics. We now have the ability to use optical scans to determine the fingerprints of an individual exactly, as opposed to keeping inked prints on file. If thumbprint matches could be tested by a computer, it would be highly unlikely, given our model, that anyone would ever be misidentified.

Comparing the output of our model with the probabilities of error in DNA analysis, we find that fingerprints are a much more accurate method of identification. Though everyone except identical twins and clones has a unique sequence of DNA, the exact sequence is not actually used as evidence in criminology. Instead, DNA is cut up with an enzyme into restriction fragment length polymorphisms (RFLPs). These pieces of DNA are then run out on a gel, which separates them by segment size [8].
Accordingly, if two or more people simply have restriction sites in approximately the same area, or even have the same amounts of DNA between restriction sites, they can be mistaken for one another. This is a much higher probability than if the exact sequence were taken into account. Accordingly, though misidentification is rare, the probability of misidentification in DNA analysis is on the order of one in ten billion, while according to our data that of fingerprint analysis is much lower [5].

VI. Conclusion

Initially, this problem aroused many concerns in us. What if one of us really had a thumb Doppelganger? We could be convicted of crimes we had never committed! This situation would be most unfortunate. However, after running our model under a case of maximum probability, we discovered that there is a better chance of misidentification through DNA profiling than through fingerprint analysis, provided the fingerprint analysis is conducted with minimal human error. This is plainly evident in the fact that DNA evidence, regarded in legal and public opinion as nearly infallible, has a probability of misidentification on the order of $10^{-10}$, while the probability of fingerprint misidentification is four orders of magnitude less, according to our model. Needless to say, it seems unreasonable to deny fingerprint profiling as evidence in a criminal trial.

Appendix A: Shared Characteristics of a Population

The following plots were used to determine the optimum figure for identification of criminals based on fingerprint evidence that is given in Section IV.

[Plot: Number of like thumbprints, 0% dependence, N=1200. Horizontal axis: number of GCs (0 to 35). Vertical axis: number of people with thumbprint, logarithmic scale. Series: US, World, China, Liechtenstein, Ever.]

Figure 5: Plot of the number of probable like thumbprints in a given country using the model of zero percent pattern dependence.
This shows that if only 10 minutiae are required to match, then it is likely that no one in the history of the world has had an exactly matching whole thumbprint.

[Plot: Number of like thumbprints, 25% dependence, N=1200. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 0 to 35. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 6: Same as above, for the 25% dependence model. Here, too, only 10 minutiae are required for a positive identification.

[Plot: Number of like thumbprints, 50% dependence, N=1200. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 0 to 35. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 7: Same as above, for the 50% pattern dependence model. Here, around 12 characteristics are required for a highly probable identification. The difference here is likely caused by error in our knowledge of pattern frequencies.

[Plot: Number of like thumbprints, 100% dependence, N=1200. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 0 to 35. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 8: Same as above, for the complete dependence model. Again, only about 10 characteristics are required for a positive identification.

Appendix B: Shared Partial Print Characteristics of a Population

The following plots were used to determine the optimum number of GCs to match within a given population if only partial prints are available for comparison.
[Plot: Number of like thumbprints, 0% dependence, N=600. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 0 to 35. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 9: A plot of the number of possible like half-thumbprints, given zero dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 100% dependence, N=600. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 0 to 35. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 10: A plot of the number of possible like half-thumbprints, given one hundred percent dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 0% dependence, N=300. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 1 to 30. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 11: A plot of the number of possible like quarter-thumbprints, given zero dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 100% dependence, N=300. Vertical axis (logarithmic): number of people with thumbprint. Horizontal axis: number of GCs, 1 to 30. Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 12: A plot of the number of possible like quarter-thumbprints, given one hundred percent dependence on fingerprint pattern.

Appendix C: Table of Calculated Probabilities

These probabilities were calculated by Pankanti et al. [6] using various past models. As noted earlier, our model, which predicts a value less than 4 x 10^-15 for the probability of each individual fingerprint, is in good agreement with these calculations.
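The closed-form expressions collected in Table 4 are simple enough to evaluate directly. The Python sketch below does so for most of the tabulated models at the two parameter settings the table uses. It is a sanity check under stated assumptions, not part of our model: the function name `table4_probabilities` is hypothetical, N is assumed to be the number of minutiae as in [6] (not the cell count of our own model), and the formulas, in particular the Osterburg constants 0.766 and 0.234, are our reading of the expressions in the table.

```python
def table4_probabilities(N, R, M):
    """Evaluate closed-form fingerprint-configuration probabilities.

    N: number of minutiae (assumed meaning, following [6]),
    R: number of regions as defined by Galton,
    M: number of regions as defined by Osterburg.
    The formulas are assumed from Table 4, not derived here.
    """
    return {
        "Galton (1892)": (1 / 16) * (1 / 256) * (1 / 2) ** R,
        "Pearson (1930)": (1 / 16) * (1 / 256) * (1 / 36) ** R,
        "Henry (1900)": (1 / 4) ** (N + 2),
        "Balthazard (1911)": (1 / 4) ** N,
        "Bose (1917)": (1 / 4) ** N,
        "Wentworth & Wilder (1918)": (1 / 50) ** N,
        "Cummins & Midlo (1943)": (1 / 31) * (1 / 50) ** N,
        "Gupta (1968)": (1 / 10) * (1 / 10) * (1 / 10) ** N,
        "Roxburgh (1933)": (1 / 1000) * (1.5 / (10 * 2.412)) ** N,
        "Trauring (1963)": 0.1944 ** N,
        # Constants 0.766 and 0.234 are an assumption (they sum to 1).
        "Osterburg et al. (1980)": 0.766 ** (M - N) * 0.234 ** N,
    }


if __name__ == "__main__":
    # The two parameter settings used as column headings in Table 4.
    full = table4_probabilities(N=36, R=24, M=72)
    partial = table4_probabilities(N=12, R=8, M=72)
    for author in full:
        print(f"{author:28s} {full[author]:10.2e} {partial[author]:10.2e}")
```

At N=12 the evaluated expressions range from about 10^-6 down to about 10^-22, which is the spread against which the figure of less than 4 x 10^-15 quoted above is being compared.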
Author                       P_fp                                     N=36, R=24, M=72   N=12, R=8, M=72
Galton (1892)                (1/16)(1/256)(1/2)^R                     1.45 x 10^-11      9.54 x 10^-7
Pearson (1930)               (1/16)(1/256)(1/36)^R                    1.09 x 10^-41      8.65 x 10^-17
Henry (1900)                 (1/4)^(N+2)                              1.32 x 10^-23      3.72 x 10^-9
Balthazard (1911)            (1/4)^N                                  2.12 x 10^-22      5.96 x 10^-8
Bose (1917)                  (1/4)^N                                  2.12 x 10^-22      5.96 x 10^-8
Wentworth & Wilder (1918)    (1/50)^N                                 6.87 x 10^-62      4.10 x 10^-21
Cummins & Midlo (1943)       (1/31)(1/50)^N                           2.22 x 10^-63      1.32 x 10^-22
Gupta (1968)                 (1/10)(1/10)(1/10)^N                     1.00 x 10^-38      1.00 x 10^-14
Roxburgh (1933)              (1/1000)(1.5/(10 x 2.412))^N             3.75 x 10^-47      3.35 x 10^-18
Trauring (1963)              (0.1944)^N                               2.47 x 10^-26      2.91 x 10^-9
Osterburg et al. (1980)      (0.766)^(M-N) (0.234)^N                  1.33 x 10^-27      3.05 x 10^-15
Stoney (1985)                (N/5) x 0.6 x (0.5 x 10^-3)^(N-1)        1.2 x 10^-80       3.5 x 10^-26

Table 4: Calculated probabilities for various models, obtained from Pankanti et al. [6]. Here, R is the number of regions of a fingerprint considered, as defined by Galton, and M is the number of regions as defined by Osterburg.

References

[1] J. Osterburg et al., “Development of a Mathematical Formula for the Calculation of Fingerprint Probabilities Based on Individual Characteristics”, Journal of the American Statistical Association, Vol. 72, No. 360, pp. 772-778, 1977.
[2] S. L. Sclove, “The Occurrence of Fingerprint Characteristics as a Two Dimensional Process”, Journal of the American Statistical Association, Vol. 74, No. 367, pp. 588-595, 1979.
[3] James F. Cowger, Friction Ridge Skin: Comparison and Identification of Fingerprints, Elsevier Science Publishing Co. Inc., New York, New York, 1983.
[4] The Noble Qur’an: In the English Language, Dr. Muhammad Taqi-un-Din Al-Hilali. Riyadh, Houston, Lahore: Darussalam Publishers and Distributors, 1998.
[5] “DNA Fingerprinting.” The Columbia Encyclopedia, Sixth Edition. New York: Columbia University Press, 2003.
[6] Sharath Pankanti et al., “On the Individuality of Fingerprints”, http://biometrics.cse.msu.edu/2cvpr230.pdf
[7] Robert Epstein, “Fingerprints Meet Daubert: The Myth of Fingerprint ‘Science’ is Revealed”, Southern California Law Review, Vol. 75, pp. 605-658, 2002.
[8] Anthony J. F. Griffiths, Modern Genetic Analysis, W. H. Freeman and Company, New York, New York, 2002.
