\[
P = \left( \frac{(H_r + V_r) \cdot \frac{C}{R}}{H_g - D} \right)^{2} \cdot \frac{1}{8}
\]
P – Probability of false-positive integration
Hr – Number of host reads
Vr – Number of virus reads
C/R – Proportion of reads that are artificial chimeras generated by Multiple Displacement Amplification [21]
Hg – Number of expected host nucleotides
D – Size of expected host deletion during integration
1/8 – Probability that both chimeras are in the proper orientation
False-Positive Integration Probability Calculations
Salmonella Dataset
Hr = 1,920,721 Vr = 14,546 C/R = 4.5e-3 Hg = 4,685,848 D = 0
P = 4.3e-7
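As a check on the Salmonella value, the substitution can be written out step by step. The intermediate figures are rounded and were computed for this illustration; they are not taken from the source.

```latex
\begin{align*}
P &= \left( \frac{(H_r + V_r) \cdot \frac{C}{R}}{H_g - D} \right)^{2} \cdot \frac{1}{8} \\
  &= \left( \frac{(1{,}920{,}721 + 14{,}546) \times 4.5 \times 10^{-3}}{4{,}685{,}848 - 0} \right)^{2} \cdot \frac{1}{8} \\
  &\approx \left( \frac{8{,}708.70}{4{,}685{,}848} \right)^{2} \cdot \frac{1}{8} \\
  &\approx \left( 1.8585 \times 10^{-3} \right)^{2} \cdot \frac{1}{8} \\
  &\approx \frac{3.454 \times 10^{-6}}{8} \approx 4.3 \times 10^{-7}
\end{align*}
```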
HCC Dataset T198
Hr + Vr = 96,911,230 C/R = 4.5e-3 Hg = 2,897,310,462 D = 500
P = 2.8e-9
HCC Dataset T268
Hr + Vr = 105,628,475 C/R = 4.5e-3 Hg = 2,897,310,462 D = 500
P = 3.4e-9
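The three values above can be reproduced with a short script. This is a minimal sketch, assuming the formula reads P = ((Hr + Vr) * C/R / (Hg - D))^2 * 1/8; the function and variable names are hypothetical and chosen only for illustration.

```python
def false_positive_integration_probability(total_reads, chimera_rate,
                                           host_genome_size, deletion_size):
    """Return P = ((Hr + Vr) * C/R / (Hg - D))**2 * 1/8.

    total_reads      -- Hr + Vr, host reads plus virus reads
    chimera_rate     -- C/R, proportion of artificial chimeric reads
    host_genome_size -- Hg, number of expected host nucleotides
    deletion_size    -- D, size of expected host deletion during integration
    """
    effective_genome = host_genome_size - deletion_size
    if effective_genome <= 0:
        raise ValueError("Hg - D must be positive")
    # Expected chimeric reads per available host position.
    chimeras_per_position = total_reads * chimera_rate / effective_genome
    # Squared for the two chimeras; 1/8 is the proper-orientation factor.
    return chimeras_per_position ** 2 / 8


# Parameter values as listed for each dataset.
datasets = {
    "Salmonella": (1_920_721 + 14_546, 4.5e-3, 4_685_848, 0),
    "HCC T198": (96_911_230, 4.5e-3, 2_897_310_462, 500),
    "HCC T268": (105_628_475, 4.5e-3, 2_897_310_462, 500),
}

for name, params in datasets.items():
    p = false_positive_integration_probability(*params)
    print(f"{name}: P = {p:.1e}")
```

Rounded to two significant figures, the script gives 4.3e-07, 2.8e-09 and 3.4e-09, which agrees with the values listed for the three datasets.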