Whole-exome Sequencing Analysis Identifies Mutations in the EYS Gene in Retinitis Pigmentosa in the Indian Population

Yanan Di1,2,3,7*, Lulin Huang2,3,7*, Periasamy Sundaresan4*, Shujin Li2,6,7, Ramasamy Kim5, Bibhuti Ballav Saikia4, Chao Qu2,7, Xiong Zhu2,3,7, Yu Zhou2,3,7, Zhilin Jiang2,7, Lin Zhang2,6,7, Ying Lin2,7, Dingding Zhang2,7, Yuanfen Li2,7, Houbin Zhang2,3,7, Yibing Yin1, Fang Lu2,3,7, Xianjun Zhu2,3,6,7#, Zhenglin Yang1,2,3,6,7#

* These authors contributed equally to this study.

Supplementary information file

The supplementary information file includes four tables (Tables S1, S2, S3 and S4).

Table S1 Primer pairs used for mutation identification by Sanger sequencing

Mutation        Primer name      Direction   Primer sequence (5’-3’)
c.8422G>A       EYS-8422         F           CTGGCAAACATCTGCAAGAA
                                 R           ATCCAACTTGGCCAGAAACA
c.7868G>A       EYS-7868         F           ATGGCATAAATGCTGTGCTG
                                 R           TTCTCTGCGCATTTCTGTATTC
c.4606C>G       EYS-4606,5038    F           GCCTCCATAAGTGCAACTCC
                                 R           CACTTGGGTGAAGTTTGAACAG
c.5038A>G       EYS-4606,5038    F           GCCTCCATAAGTGCAACTCC
                                 R           CACTTGGGTGAAGTTTGAACAG
c.9059T>C       EYS-9059         F           TGCAGAAATGGAGGTGAATG
                                 R           CCATATTCAAAGCCCCCTAGA
c.1418G>T       EYS-1418         F           TCACTGTGGTTTTAAAAATTAGCTG
                                 R           CCATTAACCACTCCCTTCCA
c.2971C>T       EYS-2971         F           TGGTTTCCAGCTTCATCCAT
                                 R           ATTTTTGCCCTGTTTGCATC
c.8455delA      EYS-8455         F           TCCGTTCAACTTCGCTACAA
                                 R           TCACCTCCATTTCTGCATGT
c.8388C>A       EYS-8388         F           TCCGTTCAACTTCGCTACAA
                                 R           TCACCTCCATTTCTGCATGT
c.7187C>G       EYS-7187         F           GCATATGTGTTCATGCATGTGT
                                 R           CCTGCTTGGTGATCAGTCTC
c.2259+1G>A     EYS-splicing     F           TGTTTTTTCCAGTGGTTGATG
                                 R           AAACCAACCTGTATAGAGTGGAGA
c.3024C>A       EYS-3024         F           GAGGGTCTTCATTTCTTGGTGATG
                                 R           TCAACTTTCCCTTGATGTTAAGTC

F: forward primer; R: reverse primer.

Table S2 Overview of data production

Items                                       ARRP 49-II:5    ARRP 206-IV:1   RP:S-2          RP:S-10         RP:S-14
Total reads                                 49,819,518      48,389,056      43,538,982      44,372,220      48,018,436
Total yield (bp)                            5,031,771,318   4,887,294,656   4,397,437,182   4,481,594,220   4,849,862,036
Average throughput depth of target region   99.9            97.0            87.3            88.9            96.2
Mapped to human genome                      49,563,492      48,211,940      43,388,304      44,242,436      47,877,526
De-duplicated by Picard tools               48,277,967      47,041,876      41,323,899      42,827,662      45,442,035
Uniquely mapped to human genome             46,960,891      45,862,952      40,264,663      41,770,934      44,292,449
Mapped to target regions                    35,099,450      35,466,449      33,097,880      33,955,987      35,627,912
% Coverage of target regions ≥1X            98.1%           98.2%           97.6%           98.0%           98.2%
% Coverage of target regions ≥10X           94%             94.9%           91.4%           93.2%           94.2%
Mean read depth of target regions           58.3            58.7            55.5            56.3            58.8
Number of SNPs                              66,992          67,419          60,462          63,932          65,656
Number of coding SNPs                       20,370          20,510          19,310          19,869          20,126
Number of synonymous SNPs                   10,557          10,703          10,027          10,299          10,525
Number of nonsynonymous SNPs                9,336           9,325           8,805           9,102           9,096
Number of Indels                            5,541           5,621           4,526           5,001           5,151
Number of coding Indels                     413             427             381             411             397

Table S2 Overview of data production (continued)

Items                                       RP:S-18         RP:S-22         RP:S-34         RP:S-40         RP:S-48
Total reads                                 52,316,262      49,518,246      46,810,310      41,988,552      48,627,488
Total yield (bp)                            5,283,942,462   5,001,342,846   4,727,841,310   4,240,843,752   4,911,376,288
Average throughput depth of target region   104.9           99.3            93.8            84.2            97.5
Mapped to human genome                      52,138,722      49,363,126      46,629,684      41,854,296      48,474,470
De-duplicated by Picard tools               49,538,198      47,303,076      44,118,412      40,213,737      46,857,500
Uniquely mapped to human genome             48,283,596      46,128,933      42,999,222      39,173,334      45,702,299
Mapped to target regions                    39,579,744      37,029,371      35,063,689      31,423,908      35,869,218
% Coverage of target regions ≥1X            97.9%           98.0%           97.7%           98.0%           98.0%
% Coverage of target regions ≥10X           93.2%           93.4%           91.8%           92.6%           93.9%
Mean read depth of target regions           66.2            61.1            58.7            52.4            59.5
Number of SNPs                              64,689          64,050          63,228          62,018          64,465
Number of coding SNPs                       20,273          19,841          19,843          19,218          19,624
Number of synonymous SNPs                   10,552          10,291          10,419          9,964           10,244
Number of nonsynonymous SNPs                9,248           9,045           8,951           8,740           8,935
Number of Indels                            4,898           4,906           4,526           4,945           5,193
Number of coding Indels                     4               413             389             414             440

Table S3 SNP quality and depth by NGS of each identified mutation

Patient ID   Mutation       SNP quality   Total depth   Alternate depth
ARRP-49      c.8422G>A      179           45            24
ARRP-49      c.7868G>A      97            56            21
ARRP-206     c.1871G>A      157           101           101
RP:S-2       c.8455delA     214           99            99
RP:S-10      c.4606C>G      124           78            38
RP:S-10      c.5038A>G      194           101           50
RP:S-14      c.9059T>C      173           106           106
RP:S-18      c.1418G>T      201           86            43
RP:S-18      c.2971C>T      167           57            28
RP:S-22      c.8388C>A      191           138           136
RP:S-34      c.7187G>C      196           62            62
RP:S-40      c.2259+1G>A    222           75            75
RP:S-48      c.3024C>A      222           98            98

Table S4 Genotypes of the members of family ARRP-49

Family member   Sex   Mutations
I:1             M     +/MU2
I:2             F     MU1/+
II:1            M     MU1/MU2
II:2            F     +/+
II:5            M     MU1/MU2
II:6            F     +/+
II:7            F     MU1/MU2
II:8            M     +/+
III:1           F     +/MU2
III:2           F     MU1/+
III:4           M     +/MU2
III:5           M     MU1/+
III:6           M     MU1/+
III:7           M     MU1/+

* M: male; F: female; MU1: c.7868G>A, p.2623G>E; MU2: c.8422G>A, p.2808A>T; MU3: c.1871G>A, p.624S>L.
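
The depth columns of Table S3 can be turned into an alternate-allele fraction (alternate depth divided by total depth), which gives a rough indication of whether a call looks heterozygous or homozygous. The following Python sketch illustrates that arithmetic using the values from Table S3. The function names and both cut-off values are hypothetical choices made for this illustration; they are not taken from the study, and the sketch is not a substitute for the Sanger validation and segregation analysis.

```python
# Alternate-allele fraction per variant, using total and alternate depth from Table S3.
# The two cut-offs below are illustrative assumptions, not values reported by the authors.

TABLE_S3 = [
    # (patient ID, mutation, total depth, alternate depth)
    ("ARRP-49", "c.8422G>A", 45, 24),
    ("ARRP-49", "c.7868G>A", 56, 21),
    ("ARRP-206", "c.1871G>A", 101, 101),
    ("RP:S-2", "c.8455delA", 99, 99),
    ("RP:S-10", "c.4606C>G", 78, 38),
    ("RP:S-10", "c.5038A>G", 101, 50),
    ("RP:S-14", "c.9059T>C", 106, 106),
    ("RP:S-18", "c.1418G>T", 86, 43),
    ("RP:S-18", "c.2971C>T", 57, 28),
    ("RP:S-22", "c.8388C>A", 138, 136),
    ("RP:S-34", "c.7187G>C", 62, 62),
    ("RP:S-40", "c.2259+1G>A", 75, 75),
    ("RP:S-48", "c.3024C>A", 98, 98),
]

HOM_MIN_FRACTION = 0.90  # assumed: at or above this, the call looks homozygous
HET_MIN_FRACTION = 0.25  # assumed: at or above this (and below HOM), looks heterozygous


def alt_fraction(total_depth, alt_depth):
    """Return the fraction of reads supporting the alternate allele."""
    if total_depth <= 0:
        raise ValueError("total depth must be positive")
    return alt_depth / total_depth


def zygosity_guess(fraction):
    """Label a fraction using the assumed cut-offs above."""
    if fraction >= HOM_MIN_FRACTION:
        return "homozygous-like"
    if fraction >= HET_MIN_FRACTION:
        return "heterozygous-like"
    return "low-fraction"


def summarize(rows):
    """Return (patient, mutation, fraction, label) for each Table S3 row."""
    summary = []
    for patient, mutation, total, alt in rows:
        fraction = alt_fraction(total, alt)
        summary.append((patient, mutation, fraction, zygosity_guess(fraction)))
    return summary


if __name__ == "__main__":
    for patient, mutation, fraction, label in summarize(TABLE_S3):
        print(f"{patient:<9} {mutation:<12} {fraction:.2f}  {label}")
```

Under these assumed cut-offs, patients carrying two different mutations (ARRP-49, RP:S-10, RP:S-18) show fractions near one half for each variant, while patients listed with a single mutation show fractions close to one.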