A problem of parcel tracking numbers.

Carl James Schwarz, P.Stat. (Canada), PStat (US)
Department of Statistics and Actuarial Science
Simon Fraser University
Burnaby, BC, V5A 1S6
cschwarz@stat.sfu.a

This report deals with various issues about a Canada Post parcel tracking number (CP 001 201 740 SY) written on a slip of paper that was found in the possession of a suspect, and its relationship to the tracking number of a parcel of interest (CP 001 201 748 SY). The two tracking numbers differ in the last digit before the final alphabetic suffix.

Information on the construction of tracking numbers was provided by an employee of Canada Post and is also available at http://en.wikipedia.org/wiki/Canada_Post

“Canada Post uses a 13 character barcode for their pre-printed labels. Bar codes consist of two letters, followed by eight sequence digits, and a ninth digit which is the check digit. The last two characters are the letters CA. The check digit seems to ignore the letters and only concern itself with the first 8 numeric digits. The scheme is to multiply each of those 8 digits by a different weighting factor, (8 6 4 2 3 5 9 7). Add up the total of all of these multiplications and divide by 11. The remainder after dividing by 11 gives a number from 0 to 10. Subtracting this from 11 gives a number from 1 to 11. That result is the check digit, except in the two cases where it is 10 or 11. If 10 it is then changed to a 0, and if 11 then it is changed to a 5. The check digit may be used to verify if a barcode scan is correct, or if a manual entry of the barcode is correct.”

Application of the above procedure to the tracking number on the slip of paper shows that the number on the slip of paper is not a valid tracking number, as the final check digit should be an 8 rather than a 0.

Can a probability be determined that a suspect would have a tracking number that is off by one digit from the tracking number on the parcel?

In short, except for the trivial case, no.
Computation of a formal probability depends on the frequency of the event of interest divided by the size of the set of possible events (assuming all possible events are equally probable). The latter is impossible to determine except in the trivial case.

For example, suppose that in a purse snatching, a customer lost a single Lotto 6/49 ticket. The customer remembers what numbers were picked. Later a suspect is found to have a Lotto 6/49 ticket in their possession with the same set of 6 numbers. What is the probability that this could happen? [This ignores information on place of purchase, etc., that can be determined from the actual tickets.]

In the trivial case, if the suspect claims that he/she randomly picked the numbers on the ticket in his/her possession, and that it was only a coincidence that the six numbers matched, then we can compute a probability. The set of all possible events is the set of all possible sets of 6 numbers that could be chosen in Lotto 6/49 (about 14,000,000). The probability that the suspect would randomly pick a set of 6 numbers that would match the stolen ticket is 1/14,000,000. While theoretically possible, it is not plausible that the suspect would randomly pick six numbers that matched the stolen ticket.

However, suppose the suspect claims that the numbers represent the birthdays of favorite movie stars and that the numbers were not chosen at random. Now it is impossible to compute a probability because it is impossible to know how many movie stars are known by the suspect, and whether the suspect knows the stars’ birthdays. About the only thing that can be done in this case is to work backwards: is there indeed a set of movie stars whose birthdays are known and match the numbers on the stolen ticket? If there is no set of six movie stars with the corresponding birthdays, then this would eliminate this rationale, but no probability can be attached.
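The figure of about 14,000,000 quoted for Lotto 6/49 is the number of ways to choose 6 numbers out of 49 when order does not matter. A minimal Python sketch of that count follows; it assumes, as the trivial case does, that every set of six numbers is equally likely, and the variable names are illustrative only.

```python
import math

# Number of distinct sets of 6 numbers that can be chosen from 1..49
# (order is ignored, no number is repeated).
lotto_combinations = math.comb(49, 6)

# Probability that a single randomly chosen set matches one given ticket,
# assuming all sets are equally likely (the "trivial case" in the text).
match_probability = 1 / lotto_combinations

print(lotto_combinations)   # 13983816, i.e. about 14,000,000
print(match_probability)
```

This only covers the random-pick scenario; it says nothing about the case where the numbers were chosen by some non-random rule such as birthdays.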
In the case of a parcel, if the number in possession of the suspect were generated at random, then the probability for this trivial case can be determined. There are one billion possible sequences of 9 digits that could be generated at random. There are 9 possible numbers that match the parcel in the first 8 of the 9 digits but do not match in the last digit. For example, there are 9 possibilities that look like 001 210 74x where x does not match. There are also 9 possibilities where the number mismatches in the second to last digit, etc. In total (9 positions, each with 9 mismatching digits), the probability is approximately 81/1,000,000,000 (81 in a billion) that a randomly generated set of 9 digits would be off by 1 digit somewhere in the nine digits compared to the parcel tracking number. So under this scenario, while theoretically possible, it is not plausible that a random generation of numbers would match in all but one digit.

Can the way the tracking numbers are generated be used to compute a probability?

The above analysis ignores the role of the check digit in parcel tracking numbers. The number in possession of the suspect looks like, but is not, a valid tracking number. As noted in the information from Canada Post, the last digit in the number does not occur at random but is computed using a formula. The tracking number should end in an 8, but the number in possession of the suspect ends in a 0.

Single digit mistakes: Suppose the suspect argues that the number in their possession is for a different parcel but a mistake was made in one of the digits. For example, the suspect intended to write down CP 101 201 740 SY but “accidentally” wrote down CP 001 201 740 SY. I examined all possible tracking numbers that were off by one digit from the parcel tracking number (e.g. 101 201 740, 201 210 740 etc) to determine if there are valid tracking numbers for which an error could have led to the close match. This list is found in Table 1 at the end of this report.
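The check-digit rule quoted from Canada Post and the one-digit search described above can be sketched in Python. This is an illustration under stated assumptions rather than the procedure actually used for the report: the letters are ignored, the starting digits are taken as 001 210 740 (the value listed in Table 1), and the function names are hypothetical.

```python
WEIGHTS = (8, 6, 4, 2, 3, 5, 9, 7)  # weighting factors from the quoted description


def check_digit(first_eight):
    """Check digit implied by the 8 sequence digits (given as a string)."""
    total = sum(w * int(d) for w, d in zip(WEIGHTS, first_eight))
    result = 11 - (total % 11)  # a number from 1 to 11
    if result == 10:
        return 0
    if result == 11:
        return 5
    return result


def is_valid(nine_digits):
    """True if the 9th digit equals the check digit of the first 8."""
    return check_digit(nine_digits[:8]) == int(nine_digits[8])


def one_digit_variants(nine_digits):
    """All 9-digit strings differing from the input in exactly one position."""
    variants = []
    for pos in range(len(nine_digits)):
        for d in "0123456789":
            if d != nine_digits[pos]:
                variants.append(nine_digits[:pos] + d + nine_digits[pos + 1:])
    return variants


slip = "001210740"  # assumption: digits as they appear in Table 1
variants = one_digit_variants(slip)
valid = [v for v in variants if is_valid(v)]

print(check_digit(slip[:8]))    # check digit implied by 0012 1074
print(len(variants))            # 9 positions x 9 wrong digits
print(len(variants) / 10**9)    # chance of a one-digit miss under random generation
print(valid)                    # candidates that pass the check-digit test
```

The count of variants corresponds to the 81 cases behind the "81 in a billion" figure, and the filtered list corresponds to the rows marked "yes" in Table 1.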
There were only 8 potential tracking numbers for which a one digit error could lead to the close match. For example, the valid tracking number 801 210 740 may have been written down as 001 210 740, leading to the close match. For each of these potential tracking numbers, there is a 1/10 chance that the digit that will be transcribed in error would be chosen.

What is the probability that the suspect would indeed have one of these 8 potential valid tracking numbers? It should be straightforward to have Canada Post examine this set of 8 potential tracking numbers to see if they have ever been used or were in transit, and to where they were delivered. If none of the 8 potential tracking numbers that could lead to a “close match through a single digit transcription” exists or was being tracked by the suspect, then there would be no valid tracking numbers that the suspect could have been tracking, and this line of defense can be eliminated.

Two digit transpositions: A similar approach was used to see if there are any valid tracking numbers for which a transposition of two digits would lead to the observed close match. I looked at all possible tracking numbers which would result in the number in possession of the suspect if a two digit transposition was “accidentally” made (Table 2). There were only two valid tracking numbers (000 210 741, transposing the 3rd and 9th digits; and 001 710 240, transposing the 4th and 7th digits) for which a two digit transposition would result in a close match. For each of these two potential tracking numbers, there is a 1/45 chance that those two specific digits would be transposed to get the close match. Again, these numbers should be checked with Canada Post, and if they have not been assigned or used, then this eliminates this line of defense.

Three digit scrambling: The tracking numbers are often written in groups of 3 (as done in this report). Perhaps a transposition among a group of 3 digits occurred (e.g. the 201 in the middle really was 012).
I examined all possible tracking numbers where a transposition among the digits in one of the groups of 3 digits could have led to the close match seen (Table 3). There are no valid tracking numbers for which this could have occurred, indicating that this type of error could not be used to “explain” why the number in possession of the suspect was a close match to the parcel’s number.

Other errors: There are many possible errors that could be examined in a similar fashion.

Please let me know if you require further information.

Carl James Schwarz


Table 1. Check parcel codes off by 1 digit

Obs  Parcel tracking code  Sum for check digit  Check digit  Is tracking code valid?
  1  101 210 740           110                  5
  2  201 210 740           118                  3
  3  301 210 740           126                  6
  4  401 210 740           134                  9
  5  501 210 740           142                  1
  6  601 210 740           150                  4
  7  701 210 740           158                  7
  8  801 210 740           166                  0            yes
  9  901 210 740           174                  2
 10  011 210 740           108                  2
 11  021 210 740           114                  7
 12  031 210 740           120                  1
 13  041 210 740           126                  6
 14  051 210 740           132                  5
 15  061 210 740           138                  5
 16  071 210 740           144                  0            yes
 17  081 210 740           150                  4
 18  091 210 740           156                  9
 19  000 210 740           98                   1
 20  002 210 740           106                  4
 21  003 210 740           110                  5
 22  004 210 740           114                  7
 23  005 210 740           118                  3
 24  006 210 740           122                  0            yes
 25  007 210 740           126                  6
 26  008 210 740           130                  2
 27  009 210 740           134                  9
 28  001 010 740           98                   1
 29  001 110 740           100                  0            yes
 30  001 310 740           104                  6
 31  001 410 740           106                  4
 32  001 510 740           108                  2
 33  001 610 740           110                  5
 34  001 710 740           112                  9
 35  001 810 740           114                  7
 36  001 910 740           116                  5
 37  001 200 740           99                   5
 38  001 220 740           105                  5
 39  001 230 740           108                  2
 40  001 240 740           111                  0            yes
 41  001 250 740           114                  7
 42  001 260 740           117                  4
 43  001 270 740           120                  1
 44  001 280 740           123                  9
 45  001 290 740           126                  6
 46  001 211 740           107                  3
 47  001 212 740           112                  9
 48  001 213 740           117                  4
 49  001 214 740           122                  0            yes
 50  001 215 740           127                  5
 51  001 216 740           132                  5
 52  001 217 740           137                  6
 53  001 218 740           142                  1
 54  001 219 740           147                  7
 55  001 210 040           39                   5
 56  001 210 140           48                   7
 57  001 210 240           57                   9
 58  001 210 340           66                   5
 59  001 210 440           75                   2
 60  001 210 540           84                   4
 61  001 210 640           93                   6
 62  001 210 840           111                  0            yes
 63  001 210 940           120                  1
 64  001 210 700           74                   3
 65  001 210 710           81                   7
 66  001 210 720           88                   5
 67  001 210 730           95                   4
 68  001 210 750           109                  1
 69  001 210 760           116                  5
 70  001 210 770           123                  9
 71  001 210 780           130                  2
 72  001 210 790           137                  6
 73  001 210 741           102                  8
 74  001 210 742           102                  8
 75  001 210 743           102                  8
 76  001 210 744           102                  8
 77  001 210 745           102                  8
 78  001 210 746           102                  8
 79  001 210 747           102                  8
 80  001 210 748           102                  8            yes
 81  001 210 749           102                  8


Table 2. Check parcel codes with transpose of 2 digits

Obs  Parcel tracking code  Transpose position 1  Transpose position 2  Sum for check digit  Check digit  Is tracking code valid?
  1  001 210 740           1                     2                     102                  8
  2  100 210 740           1                     3                     106                  4
  3  201 010 740           1                     4                     114                  7
  4  101 200 740           1                     5                     107                  3
  5  001 210 740           1                     6                     102                  8
  6  701 210 040           1                     7                     95                   4
  7  401 210 700           1                     8                     106                  4
  8  001 210 740           1                     9                     102                  8
  9  010 210 740           2                     3                     104                  6
 10  021 010 740           2                     4                     110                  5
 11  011 200 740           2                     5                     105                  5
 12  001 210 740           2                     6                     102                  8
 13  071 210 040           2                     7                     81                   7
 14  041 210 700           2                     8                     98                   1
 15  001 210 740           2                     9                     102                  8
 16  002 110 740           3                     4                     104                  6
 17  001 210 740           3                     5                     102                  8
 18  000 211 740           3                     6                     103                  7
 19  007 210 140           3                     7                     72                   5
 20  004 210 710           3                     8                     93                   6
 21  000 210 741           3                     9                     98                   1            yes
 22  001 120 740           4                     5                     103                  7
 23  001 012 740           4                     6                     108                  2
 24  001 710 240           4                     7                     67                   0            yes
 25  001 410 720           4                     8                     92                   7
 26  001 010 742           4                     9                     98                   1
 27  001 201 740           5                     6                     104                  6
 28  001 270 140           5                     7                     66                   5
 29  001 240 710           5                     8                     90                   9
 30  001 200 741           5                     9                     99                   5
 31  001 217 040           6                     7                     74                   3
 32  001 214 700           6                     8                     94                   5
 33  001 210 740           6                     9                     102                  8
 34  001 210 470           7                     8                     96                   3
 35  001 210 047           7                     9                     39                   5
 36  001 210 704           8                     9                     74                   3


Table 3. Check parcel codes with transpose among groups of 3 digits

Obs  Parcel tracking code  Start of 3-digit group  Sum for check digit  Check digit  Is tracking code valid?
  1  001 210 740           1                       102                  8
  2  010 210 740           1                       104                  6
  3  001 210 740           1                       102                  8
  4  010 210 740           1                       104                  6
  5  100 210 740           1                       106                  4
  6  100 210 740           1                       106                  4
  7  001 210 740           4                       102                  8
  8  001 201 740           4                       104                  6
  9  001 120 740           4                       103                  7
 10  001 102 740           4                       107                  3
 11  001 021 740           4                       106                  4
 12  001 012 740           4                       108                  2
 13  001 210 740           7                       102                  8
 14  001 210 704           7                       74                   3
 15  001 210 470           7                       96                   3
 16  001 210 407           7                       47                   8
 17  001 210 074           7                       60                   6
 18  001 210 047           7                       39                   5
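The searches summarized in Tables 2 and 3 can be reproduced with a short script. The Python sketch below is one possible way to do so, not necessarily how the tables were produced; it assumes the starting digits 001 210 740 used in the tables, applies the check-digit rule quoted from Canada Post, and uses hypothetical names throughout.

```python
from itertools import combinations, permutations

WEIGHTS = (8, 6, 4, 2, 3, 5, 9, 7)  # weighting factors from the quoted description


def is_valid(nine_digits):
    """True if the 9th digit equals the check digit of the first 8 digits."""
    total = sum(w * int(d) for w, d in zip(WEIGHTS, nine_digits[:8]))
    result = 11 - (total % 11)               # a number from 1 to 11
    check = {10: 0, 11: 5}.get(result, result)
    return check == int(nine_digits[8])


slip = "001210740"  # assumption: digits as they appear in the tables

# Table 2: swap every pair of positions (36 pairs among 9 digits).
swapped_valid = []
for i, j in combinations(range(9), 2):
    digits = list(slip)
    digits[i], digits[j] = digits[j], digits[i]
    candidate = "".join(digits)
    if is_valid(candidate):
        swapped_valid.append((candidate, i + 1, j + 1))  # 1-based positions

# Table 3: rearrange the digits within each group of three.
group_valid = []
for start in (0, 3, 6):
    for perm in permutations(slip[start:start + 3]):
        candidate = slip[:start] + "".join(perm) + slip[start + 3:]
        if is_valid(candidate):
            group_valid.append((candidate, start + 1))

print(swapped_valid)  # candidates marked "yes" in Table 2
print(group_valid)    # empty: no row of Table 3 is valid
```

As in the tables, swaps of two equal digits and repeated arrangements of a group are left in the enumeration, so the unchanged number appears several times among the candidates tested.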