Supplementary Material for “Identifying Mutation Regions for Closely Related Individuals without a Known Pedigree”

1 The experiments for pedigrees containing 5 generations

We study different sets of input individuals in the latest two generations of Pedigrees 2-4 in the paper, each of which contains 5 generations. These sets of input individuals are given in Figures 1-3, and the results are shown in Tables 1-3. Recall is above 94% in all cases, and precision improves as the number of diseased individuals increases.

2 The experiments for pedigrees containing 6 generations

We also run experiments on pedigrees containing 6 generations. Here we consider Pedigrees 5-8, as shown in Figures 4-7. The results are shown in Table 3 in the paper.

We study different sets of input individuals in the latest two generations of Pedigrees 5-8, each of which contains 6 generations. These sets of input individuals are given in Figures 8-11, and the results are shown in Tables 4-7. Recall is above 90% in all but three of the 48 cases, and precision improves as the number of diseased individuals increases. The behavior is similar to that observed for 5 generations.

Figure 1: The different sets of input individuals based on Pedigree 2 in the paper.

Figure 2: The different sets of input individuals based on Pedigree 3 in the paper.

Figure 3: The different sets of input individuals based on Pedigree 4 in the paper.
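The two measures reported throughout can be illustrated with a minimal sketch. It assumes that precision and recall are measured by base-pair overlap between predicted and true mutation regions; this supplement does not restate the definition, so the paper's exact measure may differ. The names `merge`, `overlap_length`, and `precision_recall` are hypothetical helpers, not part of the authors' program.

```python
from typing import List, Tuple

# Half-open interval [start, end) in base pairs (an assumed representation).
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> int:
    """Total length covered by a list of disjoint intervals."""
    return sum(end - start for start, end in intervals)


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Length shared by two interval sets, using a two-pointer sweep."""
    a, b = merge(a), merge(b)
    i = j = 0
    shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def precision_recall(predicted: List[Interval],
                     truth: List[Interval]) -> Tuple[float, float]:
    """Overlap-based precision and recall (assumed definition, see lead-in)."""
    shared = overlap_length(predicted, truth)
    predicted_len = total_length(merge(predicted))
    truth_len = total_length(merge(truth))
    precision = shared / predicted_len if predicted_len else 0.0
    recall = shared / truth_len if truth_len else 0.0
    return precision, recall
```

Under this assumed definition, a predicted region (100, 300) that fully covers a true region (150, 250) but is twice as long gives recall 1.0 and precision 0.5, which is the pattern of high recall with lower precision seen in the tables.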
| input | precision | recall |
|---|---|---|
| 3d-5fam-1 | 79.83% | 99.04% |
| 3d-5fam-2 | 74.71% | 98.54% |
| 3d-5fam-3 | 65.29% | 97.73% |
| 3d-4fam-1 | 78.80% | 99.27% |
| 3d-4fam-2 | 68.07% | 98.63% |
| 3d-4fam-3 | 59.62% | 95.75% |
| 3d-3fam-1 | 76.77% | 99.26% |
| 3d-3fam-2 | 66.28% | 98.66% |
| 3d-3fam-3 | 56.29% | 95.06% |
| 3d-2fam-1 | 72.23% | 97.99% |
| 3d-2fam-2 | 57.69% | 95.53% |
| 3d-2fam-3 | 54.67% | 94.32% |

Table 1: Results on Figure 1

| input | precision | recall |
|---|---|---|
| 4d-5fam-1 | 87.44% | 96.53% |
| 4d-5fam-2 | 85.13% | 97.11% |
| 4d-5fam-3 | 80.54% | 97.41% |
| 4d-4fam-1 | 85.85% | 96.70% |
| 4d-4fam-2 | 82.04% | 97.08% |
| 4d-4fam-3 | 78.82% | 97.68% |
| 4d-3fam-1 | 83.81% | 96.49% |
| 4d-3fam-2 | 80.54% | 96.58% |
| 4d-3fam-3 | 75.52% | 97.58% |
| 4d-2fam-1 | 83.78% | 96.37% |
| 4d-2fam-2 | 79.76% | 96.76% |
| 4d-2fam-3 | 74.53% | 97.45% |

Table 2: Results on Figure 2

| input | precision | recall |
|---|---|---|
| 5d-5fam-1 | 91.35% | 99.26% |
| 5d-5fam-2 | 90.79% | 99.12% |
| 5d-5fam-3 | 90.55% | 99.14% |
| 5d-5fam-4 | 88.67% | 99.08% |
| 5d-4fam-1 | 91.02% | 99.40% |
| 5d-4fam-2 | 90.43% | 99.22% |
| 5d-4fam-3 | 90.56% | 99.21% |
| 5d-4fam-4 | 88.76% | 99.26% |
| 5d-3fam-1 | 92.26% | 99.42% |
| 5d-3fam-2 | 92.25% | 99.18% |
| 5d-3fam-3 | 92.16% | 98.97% |
| 5d-3fam-4 | 89.84% | 99.19% |

Table 3: Results on Figure 3

Figure 4: Pedigree 5: a pedigree containing 6 generations with 2 diseased individuals in the input.

Figure 5: Pedigree 6: a pedigree containing 6 generations with 3 diseased individuals in the input.

Figure 6: Pedigree 7: a pedigree containing 6 generations with 4 diseased individuals in the input.

Figure 7: Pedigree 8: a pedigree containing 6 generations with 5 diseased individuals in the input.
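The precision trend reported for the 5-generation pedigrees can be checked directly from Tables 1-3. The sketch below averages the precision column of each table and tests whether the averages increase. It assumes that the `3d`, `4d`, and `5d` label prefixes denote 3, 4, and 5 diseased individuals, by analogy with the 6-generation labels; the variable and function names are illustrative only.

```python
from statistics import mean

# Precision values (%) copied from Tables 1-3, keyed by the assumed number
# of diseased individuals (the "3d", "4d", "5d" prefix of the input labels).
PRECISION_BY_DISEASED = {
    3: [79.83, 74.71, 65.29, 78.80, 68.07, 59.62,
        76.77, 66.28, 56.29, 72.23, 57.69, 54.67],
    4: [87.44, 85.13, 80.54, 85.85, 82.04, 78.82,
        83.81, 80.54, 75.52, 83.78, 79.76, 74.53],
    5: [91.35, 90.79, 90.55, 88.67, 91.02, 90.43,
        90.56, 88.76, 92.26, 92.25, 92.16, 89.84],
}


def mean_precision(table):
    """Average precision per number of diseased individuals."""
    return {k: mean(values) for k, values in table.items()}


def is_increasing(values):
    """True if each value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


means = mean_precision(PRECISION_BY_DISEASED)
ordered = [means[k] for k in sorted(means)]

for k in sorted(means):
    print(f"{k} diseased: mean precision {means[k]:.2f}%")
print("increasing with number of diseased individuals:", is_increasing(ordered))
```

Averaging over all input sets of a table is only one possible summary; comparing matching input sets (for example, the `5fam-1` rows) would be an equally reasonable check.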
| input | precision | recall |
|---|---|---|
| 6g-2d-5fam-1 | 72.38% | 98.34% |
| 6g-2d-5fam-2 | 61.48% | 98.25% |
| 6g-2d-5fam-3 | 53.47% | 97.43% |
| 6g-2d-4fam-1 | 69.38% | 98.43% |
| 6g-2d-4fam-2 | 56.54% | 97.75% |
| 6g-2d-4fam-3 | 49.21% | 96.76% |
| 6g-2d-3fam-1 | 65.74% | 98.38% |
| 6g-2d-3fam-2 | 47.10% | 96.59% |
| 6g-2d-3fam-3 | 41.66% | 89.74% |
| 6g-2d-2fam-1 | 61.65% | 97.47% |
| 6g-2d-2fam-2 | 37.10% | 90.78% |
| 6g-2d-2fam-3 | 34.01% | 80.63% |

Table 4: Results on Figure 8

Figure 8: The different sets of input individuals based on Pedigree 5

Figure 9: The different sets of input individuals based on Pedigree 6

| input | precision | recall |
|---|---|---|
| 6g-3d-5fam-1 | 70.18% | 97.40% |
| 6g-3d-5fam-2 | 65.75% | 98.17% |
| 6g-3d-5fam-3 | 54.38% | 97.12% |
| 6g-3d-4fam-1 | 66.27% | 98.00% |
| 6g-3d-4fam-2 | 58.36% | 96.53% |
| 6g-3d-4fam-3 | 49.09% | 95.46% |
| 6g-3d-3fam-1 | 62.53% | 97.36% |
| 6g-3d-3fam-2 | 50.73% | 95.05% |
| 6g-3d-3fam-3 | 41.44% | 91.49% |
| 6g-3d-2fam-1 | 60.18% | 96.28% |
| 6g-3d-2fam-2 | 45.59% | 92.51% |
| 6g-3d-2fam-3 | 35.73% | 83.58% |

Table 5: Results on Figure 9

| input | precision | recall |
|---|---|---|
| 6g-4d-5fam-1 | 90.16% | 97.10% |
| 6g-4d-5fam-2 | 90.04% | 97.16% |
| 6g-4d-5fam-3 | 89.05% | 97.20% |
| 6g-4d-5fam-4 | 84.16% | 97.71% |
| 6g-4d-4fam-1 | 90.58% | 97.42% |
| 6g-4d-4fam-2 | 90.14% | 97.49% |
| 6g-4d-4fam-3 | 88.74% | 97.83% |
| 6g-4d-4fam-4 | 82.41% | 98.34% |
| 6g-4d-3fam-1 | 90.30% | 97.63% |
| 6g-4d-3fam-2 | 89.41% | 97.69% |
| 6g-4d-3fam-3 | 87.61% | 98.01% |
| 6g-4d-3fam-4 | 81.54% | 98.51% |

Table 6: Results on Figure 10

Figure 10: The different sets of input individuals based on Pedigree 7

Figure 11: The different sets of input individuals based on Pedigree 8

| input | precision | recall |
|---|---|---|
| 6g-5d-5fam-1 | 89.56% | 96.58% |
| 6g-5d-5fam-2 | 88.91% | 96.58% |
| 6g-5d-5fam-3 | 87.72% | 96.35% |
| 6g-5d-5fam-4 | 87.28% | 97.08% |
| 6g-5d-4fam-1 | 89.56% | 97.08% |
| 6g-5d-4fam-2 | 89.11% | 97.05% |
| 6g-5d-4fam-3 | 87.69% | 96.85% |
| 6g-5d-4fam-4 | 86.67% | 97.08% |
| 6g-5d-3fam-1 | 89.42% | 97.08% |
| 6g-5d-3fam-2 | 89.05% | 97.08% |
| 6g-5d-3fam-3 | 88.04% | 97.35% |
| 6g-5d-3fam-4 | 86.40% | 97.58% |

Table 7: Results on Figure 11

3 The experiments for pedigrees containing 7 generations

The pedigrees shown in Figures 12-15 contain 7 generations and have 2, 3, 4, and 5 diseased individuals, respectively, in the latest generation.
Only the individuals in the latest generation are the input individuals. The experimental results are shown in Table 4 in the paper. Figures 16-19 show the different sets of input individuals in the latest two generations of Pedigrees 9-12. The results are shown in Tables 8-11. The performance of our program for 7 generations is similar to, but slightly worse than, that for 5 and 6 generations. We perform 200 experiments for all the cases mentioned above.

Figure 12: Pedigree 9: a pedigree containing 7 generations with 2 diseased individuals in the input.

Figure 13: Pedigree 10: a pedigree containing 7 generations with 3 diseased individuals in the input.

Figure 14: Pedigree 11: a pedigree containing 7 generations with 4 diseased individuals in the input.

Figure 15: Pedigree 12: a pedigree containing 7 generations with 5 diseased individuals in the input.

Figure 16: The different sets of input individuals based on Pedigree 9

| input | precision | recall |
|---|---|---|
| 7g-2d-5fam-1 | 74.83% | 96.79% |
| 7g-2d-5fam-2 | 64.95% | 96.80% |
| 7g-2d-5fam-3 | 55.08% | 97.41% |
| 7g-2d-4fam-1 | 69.49% | 96.87% |
| 7g-2d-4fam-2 | 58.55% | 96.87% |
| 7g-2d-4fam-3 | 50.80% | 96.32% |
| 7g-2d-3fam-1 | 58.11% | 95.29% |
| 7g-2d-3fam-2 | 51.23% | 94.57% |
| 7g-2d-3fam-3 | 43.30% | 93.78% |
| 7g-2d-2fam-1 | 48.75% | 92.04% |
| 7g-2d-2fam-2 | 46.30% | 92.14% |
| 7g-2d-2fam-3 | 37.80% | 87.33% |

Table 8: Results on Figure 16

| input | precision | recall |
|---|---|---|
| 7g-3d-5fam-1 | 77.44% | 98.57% |
| 7g-3d-5fam-2 | 72.88% | 98.67% |
| 7g-3d-5fam-3 | 61.93% | 97.89% |
| 7g-3d-4fam-1 | 72.40% | 98.27% |
| 7g-3d-4fam-2 | 66.60% | 98.76% |
| 7g-3d-4fam-3 | 54.46% | 97.34% |
| 7g-3d-3fam-1 | 57.48% | 97.76% |
| 7g-3d-3fam-2 | 57.71% | 96.57% |
| 7g-3d-3fam-3 | 45.69% | 92.68% |
| 7g-3d-2fam-1 | 50.98% | 95.24% |
| 7g-3d-2fam-2 | 55.19% | 94.11% |
| 7g-3d-2fam-3 | 42.25% | 89.19% |

Table 9: Results on Figure 17

Figure 17: The different sets of input individuals based on Pedigree 10

Figure 18: The different sets of input individuals based on Pedigree 11

| input | precision | recall |
|---|---|---|
| 7g-4d-5fam-1 | 87.51% | 97.21% |
| 7g-4d-5fam-2 | 88.43% | 97.25% |
| 7g-4d-5fam-3 | 86.93% | 97.37% |
| 7g-4d-5fam-4 | 85.25% | 97.30% |
| 7g-4d-4fam-1 | 87.53% | 97.28% |
| 7g-4d-4fam-2 | 88.58% | 97.18% |
| 7g-4d-4fam-3 | 86.63% | 97.41% |
| 7g-4d-4fam-4 | 83.31% | 97.40% |
| 7g-4d-3fam-1 | 85.66% | 97.12% |
| 7g-4d-3fam-2 | 86.29% | 97.28% |
| 7g-4d-3fam-3 | 85.34% | 97.43% |
| 7g-4d-3fam-4 | 81.05% | 97.32% |

Table 10: Results on Figure 18

| input | precision | recall |
|---|---|---|
| 7g-5d-5fam-1 | 91.02% | 96.57% |
| 7g-5d-5fam-2 | 91.20% | 96.66% |
| 7g-5d-5fam-3 | 89.45% | 96.70% |
| 7g-5d-5fam-4 | 88.32% | 96.46% |
| 7g-5d-4fam-1 | 90.85% | 96.92% |
| 7g-5d-4fam-2 | 91.22% | 96.88% |
| 7g-5d-4fam-3 | 88.87% | 96.92% |
| 7g-5d-4fam-4 | 87.45% | 96.59% |
| 7g-5d-3fam-1 | 89.62% | 96.99% |
| 7g-5d-3fam-2 | 90.33% | 96.98% |
| 7g-5d-3fam-3 | 89.24% | 96.79% |
| 7g-5d-3fam-4 | 87.51% | 96.94% |

Table 11: Results on Figure 19

Figure 19: The different sets of input individuals based on Pedigree 12
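To make the 7-generation recall figures easier to compare with the 90% level mentioned for the 6-generation pedigrees, the following sketch tallies how many cases in Tables 8-11 fall below a chosen threshold. The recall values are copied from the tables; the threshold of 90% and all names in the code are illustrative choices, not part of the authors' program.

```python
# Recall values (%) copied from Tables 8-11, keyed by table number.
RECALL_BY_TABLE = {
    8: [96.79, 96.80, 97.41, 96.87, 96.87, 96.32,
        95.29, 94.57, 93.78, 92.04, 92.14, 87.33],
    9: [98.57, 98.67, 97.89, 98.27, 98.76, 97.34,
        97.76, 96.57, 92.68, 95.24, 94.11, 89.19],
    10: [97.21, 97.25, 97.37, 97.30, 97.28, 97.18,
         97.41, 97.40, 97.12, 97.28, 97.43, 97.32],
    11: [96.57, 96.66, 96.70, 96.46, 96.92, 96.88,
         96.92, 96.59, 96.99, 96.98, 96.79, 96.94],
}

# Illustrative threshold, matching the 90% level used in Section 2.
THRESHOLD = 90.0


def count_below(values, threshold):
    """Number of values strictly below the threshold."""
    return sum(1 for v in values if v < threshold)


below = {table: count_below(values, THRESHOLD)
         for table, values in RECALL_BY_TABLE.items()}
total_cases = sum(len(values) for values in RECALL_BY_TABLE.values())

for table in sorted(below):
    print(f"Table {table}: {below[table]} case(s) with recall below {THRESHOLD}%")
print(f"{sum(below.values())} of {total_cases} cases fall below the threshold")
```

A single threshold hides how far below it a case falls, so the minimum recall per table may be a useful companion statistic.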