Exome-Sequencing, with Prior Linkage Analysis, Identifies a Candidate Gene Causing Maturity-Onset Diabetes of the Young (MODY) in a Thai Family

Ninareeman Binnima 1,*, Watip Tangjittipokin 1, Kanchana Chanprasert 1, Jatuporn Sujjitjoon 1, Prapaporn Jungtrakoon 1, Pa-thai Yenchitsomanus 2, Nattachet Plengvidhya 3,#

1 Department of Immunology and Immunology Graduate Program, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand

2 Division of Molecular Medicine, Department of Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand

3 Division of Endocrinology and Metabolism, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand

*en_nina@hotmail.com, #sinpv.natpl@gmail.com

Abstract

Maturity-onset diabetes of the young (MODY) is a monogenic form of diabetes characterized by early onset (usually before 25 years of age), non-insulin dependence and autosomal dominant inheritance. Genetic factors play a crucial role in its development. To date, mutations in at least thirteen genes have been shown to cause MODY: HNF-4α, GCK, HNF-1α, IPF-1, HNF-1β, NeuroD1, KLF11, CEL, PAX4, INS, BLK, ABCC8 and KCNJ11. However, MODY of unknown genetic etiology, or MODY-X, remains common in many ethnic groups. Because approximately 80% of Thai MODY cases are MODY-X, this study aimed to identify novel MODY genes by linkage analysis and exome-sequencing. Linkage analysis was performed in 27 members of a Thai MODY-X family, and LOD scores ≥ 2.5 were found on chromosomes 9, 11 and 22. Exome-sequencing was performed in two affected and two unaffected family members. Two novel variants located in these high-LOD regions, PTCH1 p.M1071L and GGT5 p.A48T, were selected. Four in silico programs (PolyPhen V2.0.23, SIFT, Mutation Taster and VarioWatch) were used to predict the possible deleterious effects of these variants on protein function; only PTCH1 p.M1071L was predicted to be deleterious. This variant was validated by Sanger sequencing and genotyped in family members. PTCH1 p.M1071L partially segregated with diabetes in this family and was not detected in 200 non-diabetic controls or in 66 other MODY-X probands. Notably, PTCH1 is expressed in pancreatic islets and is involved in islet development and function. PTCH1 may therefore be a pathogenic gene causing MODY in this family, and linkage analysis combined with exome-sequencing is a new strategy for identifying novel MODY genes.

Keywords: maturity-onset diabetes of the young, linkage analysis, exome-sequencing, MODY, diabetes

Introduction

Diabetes mellitus (DM) is a complex metabolic disorder characterized by hyperglycemia resulting from defects in insulin secretion, insulin action, or both (1). It is one of the most common chronic diseases, and its prevalence continues to rise significantly worldwide. Maturity-onset diabetes of the young (MODY) is a monogenic form of diabetes caused by defects in single genes that are important for pancreatic β-cell development or function; genetic factors are therefore crucial for its development. MODY is characterized by early onset, non-insulin dependence and autosomal dominant inheritance (2). To date, mutations in at least thirteen genes have been shown to cause MODY: the Hepatocyte Nuclear Factor-4α gene (HNF-4α) on chromosome 20q12-q13 (MODY1), the Glucokinase gene (GCK) on chromosome 7p15 (MODY2), the Hepatocyte Nuclear Factor-1α gene (HNF-1α) on chromosome 12q24 (MODY3), the Insulin Promoter Factor-1 gene (IPF-1) on chromosome 13q12 (MODY4), the Hepatocyte Nuclear Factor-1β gene (HNF-1β) on chromosome 17q12-q21 (MODY5), the Neurogenic Differentiation 1 gene (NeuroD1) on chromosome 2q32 (MODY6), the Kruppel-Like Factor 11 gene (KLF11) on chromosome 2p25 (MODY7), the Carboxyl-Ester Lipase gene (CEL) on chromosome 9q34 (MODY8), the Paired Box 4 gene (PAX4) on chromosome 7q32 (MODY9), the Insulin gene (INS) on chromosome 11p15.5 (MODY10), the Tyrosine Kinase, B-Lymphocyte Specific gene (BLK) on chromosome 8p23 (MODY11) (3), the ATP-Binding Cassette, Subfamily C, Member 8 gene (ABCC8) on chromosome 11p15.1 (MODY12) (4) and, very recently, the Potassium Channel, Inwardly Rectifying, Subfamily J, Member 11 gene (KCNJ11) on chromosome 11p15.1 (MODY13) (5). Although many MODY subtypes have been identified, MODY of unknown genetic etiology, or MODY-X, remains common in many ethnic groups, including Thais, in whom approximately 80% of MODY cases are MODY-X. Identifying the genes that cause MODY in Thais is therefore of interest.

Linkage analysis is an effective method for localizing chromosomal regions that may harbor disease loci, and it is well suited to investigating disease-associated loci in pedigrees with Mendelian inheritance. However, the method is limited because significant linkage is hard to establish and the causal variant is difficult to identify when the linkage region contains a large number of genes. Recently, advances in nucleic acid sequencing technology, namely Next Generation Sequencing (NGS) platforms, have made exome-sequencing feasible for investigating causal variants in both Mendelian and complex diseases. Compared with Sanger sequencing, this method is fast, has extensive coverage and is cost-effective for identifying coding variants. However, exome-sequencing yields a large number of variants, so additional filtering strategies are needed to exclude most of them. Exome-sequencing is therefore often preceded by genetic linkage analysis, which allows variants outside the linkage peaks to be excluded. For example, this combined approach identified TNC as a novel pathogenic gene in nonsyndromic hearing loss (6), a causative CRYGD mutation in a Chinese family with congenital cataract (7), and RGS12, GRPEL1, CLIC6 and WFS1 as novel causative genes for familial goiter (8), among many others.

This study aimed to identify novel MODY genes in a Thai family by linkage analysis and exome-sequencing. We expected these approaches to provide a first indication of a novel pathogenic gene for MODY in Thais. Moreover, further research in molecular biology should promote a better understanding of the mechanisms underlying the pathogenesis of MODY and lead to the development of novel methods for the therapeutic management of diabetes mellitus.

Methodology

Subject recruitment

Sixty-six MODY-X probands and a selected MODY-X family (M8) comprising 29 individuals, of whom 19 were affected with diabetes and 10 were unaffected (Figure 3), were recruited from the diabetic clinic of Siriraj Hospital, Mahidol University, Bangkok, Thailand according to the following criteria: (i) the proband and at least one first-degree relative were diagnosed with T2D before age 35; (ii) two or more generations were affected by diabetes; (iii) glycemic control could be achieved with diet and/or oral agents; (iv) there was no history of diabetic ketoacidosis (DKA); (v) the anti-glutamic acid decarboxylase (GAD) antibody, a marker for type 1 diabetes, was negative. Mutations of known MODY genes were investigated by the Siriraj Diabetes Research Group (SiDRG) (9); no mutations were identified in six known MODY genes.

Two hundred non-diabetic subjects were recruited from the health check-up facility, Department of Prevention and Social Medicine, Siriraj Hospital, Mahidol University, Bangkok, Thailand according to the following criteria: (i) age greater than 40 years; (ii) no family history of diabetes in first-degree relatives; (iii) no end-stage hepatic or renal disease; (iv) no autoimmune disease; (v) no hypertension; (vi) no terminal-stage cancer; (vii) not receiving immunosuppressive or immunostimulant drugs; (viii) not receiving drugs that affect blood glucose level or lipid metabolism; (ix) fasting plasma glucose (FPG) less than 100 mg/dl; (x) glycated hemoglobin (HbA1c) less than or equal to 5.6%; (xi) 2-hr plasma glucose less than 140 mg/dl during an oral glucose tolerance test (OGTT).

Ethics statement

All subjects were informed of the purpose and extent of the study and signed a consent form before enrollment. The informed-consent procedures were approved by the Ethics Committee, Faculty of Medicine Siriraj Hospital, Mahidol University. Relevant history, physical examination, anthropometric measurements, pedigree information and blood samples were collected.

DNA extraction

DNA samples were extracted from white blood cells (WBCs) using the standard phenol/chloroform method. DNA aliquots were reprecipitated to remove protein and fragments before use.

Genome-wide linkage analysis

Genome-wide linkage analysis was performed in 27 family members, 17 affected and 10 unaffected. DNA markers in each sample were genotyped using the Genome-Wide Human SNP Array 6.0 (Affymetrix® Genome-Wide Human SNP Nsp/Sty 6.0) according to the manufacturer's protocol. Two-point parametric linkage analysis was performed using SuperLink to obtain the maximum LOD score. Chromosome(s) with LOD score ≥ 3 were selected, and the linkage region was then determined on each selected chromosome. Multi-point parametric and nonparametric LOD scores in each identified linkage region were calculated with GeneHunter. Chromosomes showing the highest LOD scores from the two-point and multi-point analyses at the same or a neighboring region were chosen. These results were then combined with the exome data to identify candidate variants.
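For readers unfamiliar with the statistic, the sketch below illustrates what a two-point LOD score measures. It is a deliberately simplified model that assumes phase-known, fully informative meioses, in which the likelihood at recombination fraction θ is θ^k (1-θ)^(n-k) for k recombinants in n meioses and the LOD score is the log10 ratio against θ = 0.5. It is not how SuperLink or GeneHunter compute pedigree likelihoods under a disease model, and the function names are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score at recombination fraction theta for phase-known meioses."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    if theta == 0.0:
        # Complete linkage is impossible once a recombinant has been seen.
        if k > 0:
            return float("-inf")
        log_l_theta = 0.0
    else:
        log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination, theta = 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants: int, meioses: int, steps: int = 500):
    """Grid-search theta in [0, 0.5] and return (best_theta, max_lod)."""
    thetas = [0.5 * i / steps for i in range(steps + 1)]
    best = max(thetas, key=lambda t: two_point_lod(recombinants, meioses, t))
    return best, two_point_lod(recombinants, meioses, best)
```

Under this simplified model, ten informative meioses without a single recombinant give a maximum LOD score of 10 × log10(2), about 3.01, which shows how demanding a threshold of LOD ≥ 3 is for a single family.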

Exome-sequencing

Exome-sequencing was performed in 2 affected members, the proband (III3) and one relative with diabetes (III4), and in 2 unaffected members (I1, II3). Exome capture was performed using Agilent's SureSelect Human All Exon 50 Mb kit according to the manufacturer's instructions. The captured library was then loaded onto the Illumina HiSeq 2000 platform for amplification and sequencing. Sequence reads were mapped to the human reference genome (UCSC hg19/GRCh37) using the Burrows-Wheeler Aligner (BWA). Variants were detected with SAMtools, filtered on quality and read depth, and then reported.

Candidate variants selection

Sequence data were filtered using the following criteria: (i) variants identified only in the two affected members were selected; (ii) under the assumption that causal variants should alter protein sequences, nonsynonymous variants were retained; (iii) since MODY is a rare Mendelian disease with autosomal dominant inheritance, only heterozygous nonsynonymous variants with MAF ≤ 1% in the 1000 Genomes Project and NIEHS Exome Project data and not published in the dbSNP135 database were selected; (iv) variants located outside the selected linkage region on each selected chromosome were excluded. The positions of the remaining variants were then examined on the map file used in the linkage analysis. Remaining variants located at high-LOD-score positions were selected and studied further for effects on protein expression and function with in silico programs (PolyPhen V2.0.23, SIFT, Mutation Taster and VarioWatch), and those predicted to be deleterious to protein function by at least 3 of the 4 programs were selected as candidate variants.
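The filtering cascade described above can be sketched as a single predicate over annotated variants. This is a minimal illustration rather than the pipeline actually used in the study: the `Variant` fields, the helper name `passes_filters` and the base-pair interval in `LINKAGE_REGIONS` are all hypothetical (the paper reports the linkage regions as cytobands only), while the sample IDs and the 1% MAF cut-off come from the text.

```python
from dataclasses import dataclass
from typing import Optional

AFFECTED = {"III3", "III4"}   # exome-sequenced affected members (from the text)
UNAFFECTED = {"I1", "II3"}    # exome-sequenced unaffected members (from the text)

# Hypothetical base-pair interval standing in for one linkage region;
# real coordinates would have to be derived from the linkage map file.
LINKAGE_REGIONS = {"9": (90_000_000, 115_000_000)}


@dataclass(frozen=True)
class Variant:
    gene: str
    chrom: str
    pos: int
    carriers: frozenset       # sequenced samples carrying the variant allele
    nonsynonymous: bool
    heterozygous: bool
    maf: Optional[float]      # None when absent from the population databases
    in_dbsnp: bool


def passes_filters(v: Variant) -> bool:
    """Apply criteria (i) to (iv) in one pass."""
    # (i) present in both affected members and in neither unaffected member
    affected_only = AFFECTED <= v.carriers and not (UNAFFECTED & v.carriers)
    # (iii) rare (MAF <= 1%) and not listed in dbSNP
    rare_and_novel = (v.maf is None or v.maf <= 0.01) and not v.in_dbsnp
    # (iv) inside a selected linkage region
    region = LINKAGE_REGIONS.get(v.chrom)
    in_region = region is not None and region[0] <= v.pos <= region[1]
    # (ii) nonsynonymous, and heterozygous as expected for dominant inheritance
    return (affected_only and v.nonsynonymous and v.heterozygous
            and rare_and_novel and in_region)


# Made-up example records, not variants from the study.
keep = Variant("GENE_A", "9", 98_000_000, frozenset({"III3", "III4"}),
               True, True, None, False)
drop = Variant("GENE_B", "9", 98_500_000, frozenset({"III3", "III4", "I1"}),
               True, True, None, False)
candidates = [v.gene for v in (keep, drop) if passes_filters(v)]
```

Expressing all criteria in one predicate keeps the order of the steps irrelevant to the final result; counting the survivors after each individual criterion, as in Table 1, would require applying them one at a time.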

Candidate variants validation by Sanger sequencing

Candidate variant(s) were validated by Sanger sequencing. Oligonucleotide primers were designed using the Omiga program version 2.0 (Oxford Molecular Ltd., Oxford, United Kingdom). PCR products were electrophoresed on a 1.5% agarose gel and purified using the QIAquick PCR purification kit (QIAGEN, Hilden, Germany). Sanger sequencing was then performed and false-positive variants were excluded. The validated candidate variant(s) were used to analyse segregation with diabetes in the family.

Segregation analysis of diabetes in family

Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) was used to investigate the segregation of candidate variants with diabetes in the family. Segregating variant(s) were then used to investigate the MAF in Thais. Restriction enzymes for genotyping each candidate variant were determined using BioEdit version 7.0.9.0 or NEBcutter (http://tools.neb.com/NEB-cutter2/).
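The idea behind choosing an enzyme for PCR-RFLP is that the variant must create or abolish a recognition site within the amplicon. The sketch below illustrates that check under stated assumptions: the two sequences are made up for illustration and are not the PTCH1 amplicon, the three enzymes are common examples rather than the enzyme used in the study, and the helper names are hypothetical. It does not reproduce the BioEdit or NEBcutter algorithms.

```python
# Recognition sequences of three well-known enzymes (illustrative choice only).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq: str, site: str) -> int:
    """Count recognition sites on both strands, allowing overlaps."""
    seq = seq.upper()
    # A palindromic site equals its reverse complement, so the set has one entry.
    patterns = {site, revcomp(site)}
    return sum(
        1
        for p in patterns
        for i in range(len(seq) - len(p) + 1)
        if seq.startswith(p, i)
    )


def informative_enzymes(wild_type: str, variant: str) -> dict:
    """Enzymes whose number of sites differs between the two alleles."""
    result = {}
    for name, site in ENZYMES.items():
        counts = (count_sites(wild_type, site), count_sites(variant, site))
        if counts[0] != counts[1]:
            result[name] = counts
    return result


# Hypothetical amplicons differing by a single A>T change.
wild_type = "TTGACGAATACCGGA"
variant = "TTGACGAATTCCGGA"
usable = informative_enzymes(wild_type, variant)
```

In this made-up example the A>T change creates an EcoRI site, so digestion would cut only the variant allele; an enzyme that cuts both alleles equally often is uninformative for genotyping.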

Minor allele frequency (MAF) analysis

The MAF of segregating variant(s) was investigated in 200 non-diabetic controls and 66 MODY probands using the PCR-RFLP method.

Results

Genome-wide linkage analysis

Twenty-seven family members were genotyped with the Genome-Wide Human SNP Array 6.0, and the resulting genotype data were used for two-point and multi-point linkage analysis. In the two-point analysis, no chromosome reached a LOD score ≥ 3. However, 11 chromosomes (1, 4, 5, 6, 8, 9, 10, 11, 13, 16 and 22) had LOD scores ≥ 2.5. Linkage regions were determined on these chromosomes and analysed by multi-point linkage analysis using GeneHunter. Only 3 chromosomes (9, 11 and 22) showed their highest multi-point LOD scores at the same or a neighboring region as in the two-point analysis, and these were selected. The three candidate regions, q22.1-q32, p14.3-q12.2 and q11.22-q12.3 on chromosomes 9, 11 and 22 respectively (Figure 1), were combined with the exome data to search for causative variants.

Exome-sequencing analysis

Four family members, two affected (III3, III4) and two unaffected (I1, II3), underwent exome-sequencing. The average raw sequencing output was 6.02 Gb and the average read length was 101 bp. After mapping to the human reference genome (UCSC Genome Browser hg19), approximately 82.48% of total reads mapped to the human genome and approximately 71.50% of mappable reads mapped to target regions. Therefore, the average amount of target sequencing data was 62 Mb. On average, 82.70% of target regions were covered to a read depth of at least 10X, and the mean read depth of target regions was approximately 44.73X. The average numbers of total SNPs and coding SNPs were 68,817 and 19,464 respectively.

Candidate variants selection and validation

Because numerous variants were detected by exome-sequencing, specific filtering criteria were used to focus on variants of interest. First, 2,744 variants present in the two affected members and absent from the two unaffected members were selected. Under the assumption that causal variants should alter the protein sequence, 343 nonsynonymous variants were then chosen. Since MODY is a rare Mendelian disorder with autosomal dominant inheritance, 57 heterozygous nonsynonymous variants with MAF ≤ 1% in the 1000 Genomes Project and NIEHS Exome Project data and not published in the dbSNP135 database were then selected. Because many variants remained after these filters, the three candidate loci from the linkage analysis were used to exclude further variants, leaving five variants within the 3 candidate loci (Table 1). The positions of these remaining variants were then located on the map file used in the linkage analysis. Interestingly, 2 novel variants (PTCH1 p.M1071L and GGT5 p.A48T) were located at high-LOD-score positions (Figure 1). The effects of these 2 variants on protein function were investigated further with in silico programs (PolyPhen V2.0.23, SIFT, Mutation Taster and VarioWatch). Only PTCH1 p.M1071L was predicted to be deleterious to protein function by every program and was therefore selected (Table 2). Sanger sequencing was then performed to validate the exome-sequencing result and confirmed the variant as a true positive (Figure 2).
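The "at least 3 of 4 programs" rule from the Methodology can be written as a simple vote count over the predictions reported in Table 2. This is a sketch only: the mapping of each program's label to "deleterious" is an assumption made here for illustration (in particular, PolyPhen2 "Possibly damaging" is not counted as deleterious), and the function names are hypothetical.

```python
# Labels treated as a deleterious call for each program (an assumption
# made for this sketch, not a rule stated in the paper).
DELETERIOUS_LABELS = {
    "MutationTaster": {"Disease Causing"},
    "VarioWatch": {"High risk"},
    "PolyPhen2": {"Probably damaging"},
    "SIFT": {"Deleterious"},
}

# Predictions as read from Table 2.
PREDICTIONS = {
    "PTCH1 p.M1071L": {
        "MutationTaster": "Disease Causing",
        "VarioWatch": "High risk",
        "PolyPhen2": "Probably damaging",
        "SIFT": "Deleterious",
    },
    "GGT5 p.A48T": {
        "MutationTaster": "Disease Causing",
        "VarioWatch": "Low risk",
        "PolyPhen2": "Possibly damaging",
        "SIFT": "Tolerated",
    },
}


def deleterious_votes(calls: dict) -> int:
    """Number of programs whose label counts as deleterious."""
    return sum(
        calls[program] in labels
        for program, labels in DELETERIOUS_LABELS.items()
    )


def select_candidates(predictions: dict, min_votes: int = 3) -> list:
    """Variants called deleterious by at least min_votes programs."""
    return [
        variant
        for variant, calls in predictions.items()
        if deleterious_votes(calls) >= min_votes
    ]


selected = select_candidates(PREDICTIONS)
```

With this mapping, PTCH1 p.M1071L receives four votes and GGT5 p.A48T one, so only the PTCH1 variant passes. Counting "Possibly damaging" as deleterious would raise GGT5 p.A48T to two votes, which still falls below the threshold.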

Segregation analysis in family

The variant was genotyped by PCR-RFLP to determine whether it segregates with diabetes in this family. PTCH1 p.M1071L showed partial segregation with diabetes (Figure 3).

Minor allele frequency (MAF) analysis

Since MODY is a rare Mendelian disease, the MAF was also investigated in 200 non-diabetic controls and 66 other MODY probands. The variant was not observed in any of the non-diabetic controls or the other MODY probands (Table 3). We suggest that this variant is rare in the Thai population.
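The allele frequencies in Table 3 follow directly from the genotype counts, and a zero count still leaves room for a low but non-zero population frequency. The sketch below shows both calculations. The upper bound is an addition made here for illustration and is not a result reported in the study; it assumes the control chromosomes are an independent random sample from the population, and the function names are hypothetical.

```python
def allele_frequencies(n_ref_hom: int, n_het: int, n_alt_hom: int):
    """Return (reference, alternate) allele frequencies from genotype counts."""
    chromosomes = 2 * (n_ref_hom + n_het + n_alt_hom)
    if chromosomes == 0:
        raise ValueError("at least one genotyped individual is required")
    alt = n_het + 2 * n_alt_hom
    return (chromosomes - alt) / chromosomes, alt / chromosomes


def upper_bound_if_unobserved(n_individuals: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the allele frequency when no copy was seen.

    Solves (1 - p) ** (2 * n_individuals) = 1 - confidence for p.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / (2 * n_individuals))


# Genotype counts from Table 3: A/A, A/T, T/T.
controls = allele_frequencies(200, 0, 0)
probands = allele_frequencies(66, 0, 0)
control_bound = upper_bound_if_unobserved(200)
```

For the 200 controls (400 chromosomes) the bound is a little under 1%, so the absence of the variant in controls is compatible with the MAF ≤ 1% criterion but cannot by itself show that the allele is absent from the population.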

Discussion and conclusion

In the present study, we searched for pathogenic genes of MODY in a Thai family by combining linkage analysis and exome-sequencing. A candidate variant in PTCH1, which is located in the linkage region 9q22.1-9q32, was identified. It is a rare variant (MAF < 1%), and PTCH1 has not previously been reported as a pathogenic gene for MODY. The variant was detected in 10 family members: 9 affected (II1, II6, II7, II9, II10, II16, II17, III3, III4) and 1 unaffected (III7). The unaffected carrier (III7) had his blood sample collected and his blood glucose tested when he was 4 years old, so he may still be at risk of developing diabetes later. However, 10 other affected members of this family did not carry the variant. Because the grandmother (I2) is an affected founder of this family, the pathogenic gene causing diabetes in two affected members (II12 and III6) may have been inherited from I2. The other affected non-founders who did not carry the variant (I3, II4, II5, II8, II19 and III1) were probably phenocopies, as observed for other MODY genes (10, 11).

PTCH1, located at 9q22.32, encodes the Patched 1 protein, a twelve-pass transmembrane receptor for sonic hedgehog (SHH), indian hedgehog (IHH) and desert hedgehog (DHH). Hedgehog (Hh) proteins are intercellular signaling molecules that regulate mammalian organ development, differentiation and function in embryos and adults (12). Binding of Hh proteins to the PTCH1 receptor relieves the inhibition of Smoothened (Smo), a seven-pass transmembrane protein of the G protein-coupled receptor family, which in turn leads to translocation of Gli transcription factors into the nucleus and activation of transcription of target genes. Expression of PTCH1 in pancreatic islets has been reported, as have studies of Ptch1 mutations. In homozygous Ptch1 mutant mice, expression of two pancreatic markers, Pdx-1 and glucagon, is inhibited (13), and these mice die during early development. Heterozygous Ptch1 mutant mice, however, survive to adulthood and have been used to study physiological changes in blood glucose homeostasis; these mice had impaired glucose tolerance (13). Therefore, a defect of Hh signaling in pancreatic tissue may reduce Pdx-1 expression and lead to abnormal glucose homeostasis. The PTCH1 p.M1071L variant was predicted to be deleterious to protein function by all 4 in silico programs. Moreover, the affected amino acid is conserved in all selected species (Figure 2b). We therefore hypothesize that mutations in PTCH1 can cause diabetes by affecting β-cell development and insulin gene expression.

In conclusion, linkage analysis combined with exome-sequencing enabled us to identify PTCH1 as a candidate gene for MODY in this family. This research may therefore serve as a primary study for finding candidate genes causing MODY in Thais by linkage analysis and exome-sequencing. However, the molecular biology of this variant should be studied further to promote a better understanding of the mechanisms underlying the pathogenesis of MODY, which may lead to the development of novel methods for the therapeutic management of diabetes mellitus.

References

1. American Diabetes Association. Diagnosis and Classification of Diabetes Mellitus. Diabetes Care 2011;34:S62-S9.

2. Winter WE, Nakamura M, House DV. Monogenic diabetes mellitus in youth. The MODY syndromes. Endocrinol Metab Clin North Am 1999;28(4):765-85.

3. Fajans SS, Bell GI. MODY: history, genetics, pathophysiology, and clinical decision making. Diabetes Care 2011;34(8):1878-84.

4. Bowman P, Flanagan SE, Edghill EL, Damhuis A, Shepherd MH, Paisey R, et al. Heterozygous ABCC8 mutations are a cause of MODY. Diabetologia 2012;55(1):123-7.

5. Bonnefond A, Philippe J, Durand E, Dechaume A, Huyvaert M, Montagne L, et al. Whole-Exome Sequencing and High Throughput Genotyping Identified KCNJ11 as the Thirteenth MODY Gene. PLoS One 2012;7(6):e37423.

6. Zhao Y, Zhao F, Zong L, Zhang P, Guan L, Zhang J, et al. Exome sequencing and linkage analysis identified tenascin-C (TNC) as a novel causative gene in nonsyndromic hearing loss. PLoS One 2013;8(7):e69549.

7. Jia X, Zhang F, Bai J, Gao L, Zhang X, Sun H, et al. Combinational analysis of linkage and exome sequencing identifies the causative mutation in a Chinese family with congenital cataract. BMC Med Genet 2013;14(1):107.

8. Yan J, Takahashi T, Ohura T, Adachi H, Takahashi I, Ogawa E, et al. Combined linkage analysis and exome sequencing identifies novel genes for familial goiter. J Hum Genet 2013;58(6):366-77.

9. Plengvidhya N, Boonyasrisawat W, Chongjaroen N, Jungtrakoon P, Sriussadaporn S, Vannaseang S, et al. Mutations of maturity-onset diabetes of the young (MODY) genes in Thais with early-onset type 2 diabetes mellitus. Clin Endocrinol (Oxf) 2009;70(6):847-53.

10. Yamagata K, Oda N, Kaisaki PJ, Menzel S, Furuta H, Vaxillaire M, et al. Mutations in the hepatocyte nuclear factor-1alpha gene in maturity-onset diabetes of the young (MODY3). Nature 1996;384(6608):455-8.

11. Stoffers DA, Ferrer J, Clarke WL, Habener JF. Early-onset type-II diabetes mellitus (MODY4) linked to IPF1. Nat Genet 1997;17(2):138-9.

12. Hammerschmidt M, Brook A, McMahon AP. The world according to hedgehog. Trends Genet 1997;13(1):14-21.

13. Hebrok M, Kim SK, St Jacques B, McMahon AP, Melton DA. Regulation of pancreas development by hedgehog signaling. Development 2000;127(22):4905-13.

Acknowledgements: This work was supported by Siriraj Research Development Grant, Faculty of Medicine Siriraj Hospital, Mahidol University Grant (to NP), and Research Career Development Grant from Thailand Research Fund (TRF) (to NP). NP was supported by the Office of the Higher Education Commission and Mahidol University under the National Research Universities Initiative. WT was supported by Mahidol University grant. NB was supported by Siriraj Graduate Thesis Scholarship. JS was supported by the joint funding of Thailand Research Fund (TRF)-Royal Golden Jubilee PhD Scholarship and Mahidol University. PY is a TRF-Senior Research Scholar.

Figure 1 Two-point and multi-point linkage analysis results for chromosomes 9, 11 and 22.

Two-point parametric, multi-point parametric and multi-point nonparametric results are shown in the first, second and third rows, respectively. Results for chromosomes 9, 11 and 22 are shown in the first, second and third columns, respectively. These chromosomes showed high LOD scores at the same or neighboring positions in the two-point and multi-point analyses. When the linkage and exome-sequencing results were combined and the positions of candidate variants were located on the map file used for linkage analysis, 2 novel candidate variants (PTCH1 p.M1071L and GGT5 p.A48T) were found at high-LOD-score positions. The dark arrow indicates the position of the high LOD score and of the candidate variants.

Figure 2 Representation of PTCH1 p.M1071L position on chromosome 9 (a), homology of PTCH1 p.M1071L within selected species (b) and Sanger sequencing (c).

Figure 3 Pedigree of MODY-X family (M8).

Roman numerals on the left side indicate the generation numbers of the pedigree, and the red number below each symbol indicates the individual ID within that generation. Males are indicated by squares and females by circles. Filled, half-filled square, half-filled triangle and open symbols represent diabetic, increased-risk (HbA1c ≥ 5.7), impaired fasting glucose (IFG) and non-diabetic subjects, respectively. The proband is indicated by the arrow. Green stars at the upper left of symbols indicate the samples analysed by NGS. Ages of individuals are shown at the upper right of the symbols. The genotyping result for PTCH1 p.M1071L is shown in the second row under each symbol: AA is homozygous wild type and AT is heterozygous mutant. Age at diagnosis (Dx) is shown in the third row under each symbol. NA, age at diagnosis not available.

Table 1 Number of candidate variants after reduction by selection criteria.

Selection criteria                                                   Number of variants
Identified variants in only two affected samples                     2,744
Nonsynonymous variants                                               343
Heterozygous Nonsynonymous variants                                  282
Heterozygous Nonsynonymous variants -MAF ≤ 1%                        111
Heterozygous Nonsynonymous variants -MAF ≤ 1%-Novel (no dbSNP135)    57
Variants locate inside 3 candidate loci from linkage analysis        5

Table 2 Representation of variants prediction by 4 in silico programs

Gene     Annotation   MutationTaster    VarioWatch   PolyPhen2           SIFT
PTCH1    p.M1071L     Disease Causing   High risk    Probably damaging   Deleterious
GGT5     p.A48T       Disease Causing   Low risk     Possibly damaging   Tolerated

Table 3 Genotyping result of PTCH1 p.M1071L by PCR-RFLP in 200 non-diabetic controls and other 66 MODY-X probands.

Gene: PTCH1    Nucleotide change: GAT>GTT    Designation: M1071L

                       MODY (n=66)    Controls (n=200)
Genotype frequency
  A/A                  66 (1.00)      200 (1.00)
  A/T                  0 (0.00)       0 (0.00)
  T/T                  0 (0.00)       0 (0.00)
Allele frequency
  A                    1.00           1.00
  T                    0.00           0.00
