Supplementary Material for “Identifying Mutation
Regions for Closely Related Individuals without a
Known Pedigree”
1 The experiments for pedigrees containing 5 generations
We study different sets of input individuals in the latest two generations of
Pedigrees 2-4 in the paper, each of which contains 5 generations. These sets
of input individuals are given in Figures 1-3, and the results are shown in
Tables 1-3. Recall is close to 100% in all cases (at least 94.32%). Precision
improves as the number of diseased individuals increases, from
54.67%-79.83% in Table 1 to 88.67%-92.26% in Table 3.
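This supplement does not restate how precision and recall are computed. As a rough illustration only, the sketch below assumes a base-pair overlap definition: precision is the fraction of the predicted region that lies inside the true mutation region, and recall is the fraction of the true region that the prediction covers. The function names, the half-open interval convention, and the coordinates are all hypothetical and are not taken from the paper.

```python
def overlap_length(a, b):
    """Length of the overlap between two half-open intervals (start, end)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start)


def precision_recall(predicted, true):
    """Return (precision, recall) for one predicted region against one true region.

    Assumed definitions (not confirmed by the source):
      precision = |predicted AND true| / |predicted|
      recall    = |predicted AND true| / |true|
    """
    shared = overlap_length(predicted, true)
    pred_len = predicted[1] - predicted[0]
    true_len = true[1] - true[0]
    precision = shared / pred_len if pred_len > 0 else 0.0
    recall = shared / true_len if true_len > 0 else 0.0
    return precision, recall


# Hypothetical coordinates: the prediction covers the whole true region
# but starts 100 positions too early.
p, r = precision_recall(predicted=(100, 600), true=(200, 600))
print(f"precision={p:.2%} recall={r:.2%}")
```

Under this assumed definition, a result with high recall and lower precision, as in several rows of Table 1, would correspond to a predicted region that contains nearly all of the true region but extends beyond it.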
2 The experiments for pedigrees containing 6 generations
We also run experiments on pedigrees containing 6 generations. Here we
consider Pedigrees 5-8, shown in Figures 4-7. The results are shown in
Table 3 in the paper.
We then study different sets of input individuals in the latest two
generations of Pedigrees 5-8. These sets of input individuals are given in
Figures 8-11, and the results are shown in Tables 4-7. Recall exceeds 90% in
all but three of the 48 cases (two in Table 4 and one in Table 5). Precision
generally improves as the number of diseased individuals increases. The
behavior is similar to that for 5 generations.
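The recall statement for the 6-generation pedigrees can be checked directly against the tables. The short script below is only a convenience check: the recall values are transcribed from Tables 4-7 of this supplement, while the variable names and the 90% threshold constant are illustrative choices.

```python
# Recall values (in percent) transcribed from Tables 4-7 of this supplement.
recall_by_table = {
    "Table 4": [98.34, 98.25, 97.43, 98.43, 97.75, 96.76,
                98.38, 96.59, 89.74, 97.47, 90.78, 80.63],
    "Table 5": [97.40, 98.17, 97.12, 98.00, 96.53, 95.46,
                97.36, 95.05, 91.49, 96.28, 92.51, 83.58],
    "Table 6": [97.10, 97.16, 97.20, 97.71, 97.42, 97.49,
                97.83, 98.34, 97.63, 97.69, 98.01, 98.51],
    "Table 7": [96.58, 96.58, 96.35, 97.08, 97.08, 97.05,
                96.85, 97.08, 97.08, 97.08, 97.35, 97.58],
}

THRESHOLD = 90.0  # percent

# Flatten all tables and count the cases strictly above the threshold.
all_values = [v for values in recall_by_table.values() for v in values]
above = sum(1 for v in all_values if v > THRESHOLD)

# Collect, per table, the cases that do not exceed the threshold.
below = {
    name: [v for v in values if v <= THRESHOLD]
    for name, values in recall_by_table.items()
}

print(f"{above} of {len(all_values)} cases have recall above {THRESHOLD}%")
for name, values in below.items():
    if values:
        print(f"{name}: at or below threshold -> {values}")
```

With these values, 45 of the 48 cases exceed 90%; the exceptions are 89.74% and 80.63% in Table 4 and 83.58% in Table 5.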
Figure 1: The different sets of input individuals based on Pedigree 2 in the
paper.
Figure 2: The different sets of input individuals based on Pedigree 3 in the
paper.
Figure 3: The different sets of input individuals based on Pedigree 4 in the
paper.
input         precision   recall
3d-5fam-1     79.83%      99.04%
3d-5fam-2     74.71%      98.54%
3d-5fam-3     65.29%      97.73%
3d-4fam-1     78.80%      99.27%
3d-4fam-2     68.07%      98.63%
3d-4fam-3     59.62%      95.75%
3d-3fam-1     76.77%      99.26%
3d-3fam-2     66.28%      98.66%
3d-3fam-3     56.29%      95.06%
3d-2fam-1     72.23%      97.99%
3d-2fam-2     57.69%      95.53%
3d-2fam-3     54.67%      94.32%
Table 1: Results on Figure 1
input         precision   recall
4d-5fam-1     87.44%      96.53%
4d-5fam-2     85.13%      97.11%
4d-5fam-3     80.54%      97.41%
4d-4fam-1     85.85%      96.70%
4d-4fam-2     82.04%      97.08%
4d-4fam-3     78.82%      97.68%
4d-3fam-1     83.81%      96.49%
4d-3fam-2     80.54%      96.58%
4d-3fam-3     75.52%      97.58%
4d-2fam-1     83.78%      96.37%
4d-2fam-2     79.76%      96.76%
4d-2fam-3     74.53%      97.45%
Table 2: Results on Figure 2
input         precision   recall
5d-5fam-1     91.35%      99.26%
5d-5fam-2     90.79%      99.12%
5d-5fam-3     90.55%      99.14%
5d-5fam-4     88.67%      99.08%
5d-4fam-1     91.02%      99.40%
5d-4fam-2     90.43%      99.22%
5d-4fam-3     90.56%      99.21%
5d-4fam-4     88.76%      99.26%
5d-3fam-1     92.26%      99.42%
5d-3fam-2     92.25%      99.18%
5d-3fam-3     92.16%      98.97%
5d-3fam-4     89.84%      99.19%
Table 3: Results on Figure 3
Figure 4: Pedigree 5: a pedigree containing 6 generations with 2 diseased
individuals in the input.
Figure 5: Pedigree 6: a pedigree containing 6 generations with 3 diseased
individuals in the input.
Figure 6: Pedigree 7: a pedigree containing 6 generations with 4 diseased
individuals in the input.
Figure 7: Pedigree 8: a pedigree containing 6 generations with 5 diseased
individuals in the input.
input            precision   recall
6g-2d-5fam-1     72.38%      98.34%
6g-2d-5fam-2     61.48%      98.25%
6g-2d-5fam-3     53.47%      97.43%
6g-2d-4fam-1     69.38%      98.43%
6g-2d-4fam-2     56.54%      97.75%
6g-2d-4fam-3     49.21%      96.76%
6g-2d-3fam-1     65.74%      98.38%
6g-2d-3fam-2     47.10%      96.59%
6g-2d-3fam-3     41.66%      89.74%
6g-2d-2fam-1     61.65%      97.47%
6g-2d-2fam-2     37.10%      90.78%
6g-2d-2fam-3     34.01%      80.63%
Table 4: Results on Figure 8
Figure 8: The different sets of input individuals based on Pedigree 5
Figure 9: The different sets of input individuals based on Pedigree 6
input            precision   recall
6g-3d-5fam-1     70.18%      97.40%
6g-3d-5fam-2     65.75%      98.17%
6g-3d-5fam-3     54.38%      97.12%
6g-3d-4fam-1     66.27%      98.00%
6g-3d-4fam-2     58.36%      96.53%
6g-3d-4fam-3     49.09%      95.46%
6g-3d-3fam-1     62.53%      97.36%
6g-3d-3fam-2     50.73%      95.05%
6g-3d-3fam-3     41.44%      91.49%
6g-3d-2fam-1     60.18%      96.28%
6g-3d-2fam-2     45.59%      92.51%
6g-3d-2fam-3     35.73%      83.58%
Table 5: Results on Figure 9
input            precision   recall
6g-4d-5fam-1     90.16%      97.10%
6g-4d-5fam-2     90.04%      97.16%
6g-4d-5fam-3     89.05%      97.20%
6g-4d-5fam-4     84.16%      97.71%
6g-4d-4fam-1     90.58%      97.42%
6g-4d-4fam-2     90.14%      97.49%
6g-4d-4fam-3     88.74%      97.83%
6g-4d-4fam-4     82.41%      98.34%
6g-4d-3fam-1     90.30%      97.63%
6g-4d-3fam-2     89.41%      97.69%
6g-4d-3fam-3     87.61%      98.01%
6g-4d-3fam-4     81.54%      98.51%
Table 6: Results on Figure 10
Figure 10: The different sets of input individuals based on Pedigree 7
Figure 11: The different sets of input individuals based on Pedigree 8
input            precision   recall
6g-5d-5fam-1     89.56%      96.58%
6g-5d-5fam-2     88.91%      96.58%
6g-5d-5fam-3     87.72%      96.35%
6g-5d-5fam-4     87.28%      97.08%
6g-5d-4fam-1     89.56%      97.08%
6g-5d-4fam-2     89.11%      97.05%
6g-5d-4fam-3     87.69%      96.85%
6g-5d-4fam-4     86.67%      97.08%
6g-5d-3fam-1     89.42%      97.08%
6g-5d-3fam-2     89.05%      97.08%
6g-5d-3fam-3     88.04%      97.35%
6g-5d-3fam-4     86.40%      97.58%
Table 7: Results on Figure 11
3 The experiments for pedigrees containing 7 generations
The pedigrees shown in Figures 12-15 contain 7 generations and 2, 3, 4, and 5
diseased individuals, respectively, in the latest generation. Only the
individuals in the latest generation are input individuals. The experimental
results are shown in Table 4 in the paper.
Figures 16-19 show the different sets of input individuals in the latest
two generations of Pedigrees 9-12. The results are shown in Tables 8-11. The
performance of our program for 7 generations is similar to, but slightly
worse than, that for 5 and 6 generations. We run 200 experiments for all
the cases mentioned above.
Figure 12: Pedigree 9: a pedigree containing 7 generations with 2 diseased
individuals in the input.
Figure 13: Pedigree 10: a pedigree containing 7 generations with 3 diseased
individuals in the input.
Figure 14: Pedigree 11: a pedigree containing 7 generations with 4 diseased
individuals in the input.
Figure 15: Pedigree 12: a pedigree containing 7 generations with 5 diseased
individuals in the input.
Figure 16: The different sets of input individuals based on Pedigree 9
input            precision   recall
7g-2d-5fam-1     74.83%      96.79%
7g-2d-5fam-2     64.95%      96.80%
7g-2d-5fam-3     55.08%      97.41%
7g-2d-4fam-1     69.49%      96.87%
7g-2d-4fam-2     58.55%      96.87%
7g-2d-4fam-3     50.80%      96.32%
7g-2d-3fam-1     58.11%      95.29%
7g-2d-3fam-2     51.23%      94.57%
7g-2d-3fam-3     43.30%      93.78%
7g-2d-2fam-1     48.75%      92.04%
7g-2d-2fam-2     46.30%      92.14%
7g-2d-2fam-3     37.80%      87.33%
Table 8: Results on Figure 16
input            precision   recall
7g-3d-5fam-1     77.44%      98.57%
7g-3d-5fam-2     72.88%      98.67%
7g-3d-5fam-3     61.93%      97.89%
7g-3d-4fam-1     72.40%      98.27%
7g-3d-4fam-2     66.60%      98.76%
7g-3d-4fam-3     54.46%      97.34%
7g-3d-3fam-1     57.48%      97.76%
7g-3d-3fam-2     57.71%      96.57%
7g-3d-3fam-3     45.69%      92.68%
7g-3d-2fam-1     50.98%      95.24%
7g-3d-2fam-2     55.19%      94.11%
7g-3d-2fam-3     42.25%      89.19%
Table 9: Results on Figure 17
Figure 17: The different sets of input individuals based on Pedigree 10
Figure 18: The different sets of input individuals based on Pedigree 11
input            precision   recall
7g-4d-5fam-1     87.51%      97.21%
7g-4d-5fam-2     88.43%      97.25%
7g-4d-5fam-3     86.93%      97.37%
7g-4d-5fam-4     85.25%      97.30%
7g-4d-4fam-1     87.53%      97.28%
7g-4d-4fam-2     88.58%      97.18%
7g-4d-4fam-3     86.63%      97.41%
7g-4d-4fam-4     83.31%      97.40%
7g-4d-3fam-1     85.66%      97.12%
7g-4d-3fam-2     86.29%      97.28%
7g-4d-3fam-3     85.34%      97.43%
7g-4d-3fam-4     81.05%      97.32%
Table 10: Results on Figure 18
input            precision   recall
7g-5d-5fam-1     91.02%      96.57%
7g-5d-5fam-2     91.20%      96.66%
7g-5d-5fam-3     89.45%      96.70%
7g-5d-5fam-4     88.32%      96.46%
7g-5d-4fam-1     90.85%      96.92%
7g-5d-4fam-2     91.22%      96.88%
7g-5d-4fam-3     88.87%      96.92%
7g-5d-4fam-4     87.45%      96.59%
7g-5d-3fam-1     89.62%      96.99%
7g-5d-3fam-2     90.33%      96.98%
7g-5d-3fam-3     89.24%      96.79%
7g-5d-3fam-4     87.51%      96.94%
Table 11: Results on Figure 19
Figure 19: The different sets of input individuals based on Pedigree 12