Supplemental Material
Supplemental Methods
Kv7.1 Ortholog and Paralog Conservation Analysis
For inter-species (ortholog) conservation analysis, the University of California Santa Cruz (UCSC)
Genome Browser (http://genome.ucsc.edu/) alignment of 43 species including primates, other placental
mammals, monotremes and non-mammalian vertebrates to the primary human Kv7.1 sequences was used.
Additionally, the sequences of the five human paralogs from the Kv7 family (KCNQ1-encoded Kv7.1
[UniProt: P51787], KCNQ2-encoded Kv7.2 [UniProt: O43526], KCNQ3-encoded Kv7.3 [UniProt:
O43525], KCNQ4-encoded Kv7.4 [UniProt: P56696], KCNQ5-encoded Kv7.5 [UniProt: Q9NR82]) were
obtained from UniProt and aligned using ClustalW (http://www.ebi.ac.uk/Tools/msa/clustalw2/). As
sequences within this family have regions of low similarity, each individual region (i.e. N-terminus, S1,
S1/2, S2, S2/3, S3, S3/4, S4, S4/5, S5, S5/6, S6, C-terminus) was aligned individually and subsequently
assembled into a continuous sequence relative to the Kv7.1 protein sequence.
For each of the amino acid residues of the human Kv7.1 protein, the percent conservation for both
orthologs and paralogs was calculated as the number of orthologs or paralogs hosting the human Kv7.1
residue over the total aligned residues (unaligned orthologs and paralogs at each residue were not used in
the calculation). For example, 26 of the 43 species were aligned at amino acid position 350 (17 species
had gaps at this position). All 26 aligned orthologs hosted the amino acid residue Glycine (G) at position
350 thus indicating a 100% conservation of this amino acid residue at this position. In contrast, amino
acid position 477 only had 10/25 (40%) aligned species hosting the human Kv7.1 amino acid Proline (P).
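The per-residue calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the script used in the study: the function name `percent_conservation`, the use of `-` as the gap character, and the placeholder residue `X` standing in for any non-human residue are all assumptions. The two example columns are built only from the counts quoted in the text for positions 350 and 477.

```python
GAP = "-"  # assumed gap symbol; the alignment export may use a different one


def percent_conservation(human_residue, aligned_column):
    """Return (matches, aligned, percent) for one alignment column.

    Gapped (unaligned) sequences are excluded from the denominator,
    as described in the text.
    """
    aligned = [r for r in aligned_column if r != GAP]
    if not aligned:
        return 0, 0, None  # no species aligned at this position
    matches = sum(1 for r in aligned if r == human_residue)
    return matches, len(aligned), 100.0 * matches / len(aligned)


# Position 350: 26 of 43 species aligned, all carrying glycine.
column_350 = ["G"] * 26 + [GAP] * 17
print(percent_conservation("G", column_350))  # (26, 26, 100.0)

# Position 477: 10 of 25 aligned species carry proline.
# "X" is a placeholder for whichever other residues were observed.
column_477 = ["P"] * 10 + ["X"] * 15 + [GAP] * 18
print(percent_conservation("P", column_477))  # (10, 25, 40.0)
```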
Phenotype Prediction Analyses
For the phylogenetic classifications, genetic variants were classified as occurring at a position with either
no substitutions or ≥ 1 substitution(s) in the orthologs or paralogs. Variants at positions with ≥ 1
substitution were classified as benign and variants with 0 substitutions were considered pathogenic.
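As a rough sketch of this rule, the classification can be written as a single function over one alignment column. The function name, the gap symbol, and the reading of the threshold as "at least one substitution" are assumptions made for illustration; a substitution is counted here as any aligned residue that differs from the human Kv7.1 residue.

```python
def classify_by_substitutions(human_residue, aligned_column, gap="-"):
    """Classify a variant position from its ortholog or paralog column.

    Assumption: a position with no substitutions is called pathogenic,
    and a position with at least one substitution is called benign.
    """
    substitutions = sum(
        1 for r in aligned_column if r != gap and r != human_residue
    )
    return "pathogenic" if substitutions == 0 else "benign"


# Fully conserved column (gaps ignored) versus a column with substitutions.
print(classify_by_substitutions("G", ["G", "G", "-", "G"]))  # pathogenic
print(classify_by_substitutions("P", ["P", "S", "-", "A"]))  # benign
```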
In order to assess the physicochemical properties of rare variants, Grantham chemical scores were
calculated using the Grantham amino acid difference matrix as previously described.1 Grantham values
range from 5 (most conservative) to 215 (most radical), with values ≥ 150 considered radical, 100 to 149
considered moderately radical, 50 to 99 considered moderately conservative, and < 50 considered
conservative. For the purposes of this study, rare variants were classified as radical (Grantham value ≥
100) or conservative (Grantham value < 100).
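The thresholds above translate directly into two small lookup functions, sketched here under the assumption that the Grantham score for a substitution has already been obtained from the published matrix. The function names are hypothetical and the input scores in the example are arbitrary illustrative values, not scores for specific variants in this study.

```python
def grantham_category(score):
    """Four-level Grantham category using the cut-offs given in the text."""
    if score >= 150:
        return "radical"
    if score >= 100:
        return "moderately radical"
    if score >= 50:
        return "moderately conservative"
    return "conservative"


def study_classification(score):
    """Two-level grouping used in this study (cut-off at 100)."""
    return "radical" if score >= 100 else "conservative"


# Arbitrary example scores spanning the four categories.
for score in (20, 60, 120, 215):
    print(score, grantham_category(score), study_classification(score))
```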
SIFT, version 4.0.5, an additional conservation-based metric, was used with default settings to
analyze the Kv7.1 protein sequence and provide phenotype predictions for each rare variant identified in
cases and controls. The assumptions and exact methodology employed by the current version of
the SIFT algorithm have been described previously.2 For the purposes of this study, rare variants were
classified as either “Tolerated” or “Damaging” based on the SIFT prediction.
PolyPhen2, version 2.1.0, was used with default settings to analyze the effect of rare variants on
the secondary and tertiary protein structure of the Kv7.1 channel using information derived from the
Protein Data Bank (PDB) and Database of Secondary Structure Assignments (DSSP). The assumptions
and exact methodology of the PolyPhen2 algorithm have been described previously.3 PolyPhen2
classified each variant as “probably damaging”, “possibly damaging”, or “benign”. For this study, those
rare variants labeled as “probably damaging” or “possibly damaging” were combined as “damaging”.
KvSNP provides predictions based on a machine learning classifier optimized for Kv channel
SNPs. The assumptions and exact methodology employed by the KvSNP algorithm have been described
previously.4 KvSNP classified each variant based on a probability of disease causation from 0 to 1. If this
probability is equal to, or exceeds, 0.5 then the variant is predicted to be disease-causing, otherwise it is
predicted to be benign.
MutPred, version 1.2, provides phenotypic classifications based on conservation parameters adapted
from the SIFT algorithm as well as 14 different structural and functional properties. The assumptions and
exact methodology of the MutPred algorithm have been described previously.5 Each variant was scored with
either “very confident hypotheses,” “confident hypotheses,” “actionable hypotheses,” or “benign” by the
MutPred predictions. For this study, those rare variants classified as “benign” were considered benign,
while the remaining classifications were grouped as “pathogenic”.
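The binary groupings described for the four tools can be summarized in one helper, sketched below. It assumes each tool's output has already been collected as the label (or, for KvSNP, the probability) named in the text. The function name `predicted_pathogenic` and the tool keys are hypothetical, and the sketch does not call any of the tools themselves.

```python
def predicted_pathogenic(tool, output):
    """Return True when a tool's output falls in the study's pathogenic group."""
    if tool == "SIFT":
        # "Tolerated" versus "Damaging"
        return output == "Damaging"
    if tool == "PolyPhen2":
        # "probably damaging" and "possibly damaging" are combined
        return output in ("probably damaging", "possibly damaging")
    if tool == "KvSNP":
        # probability of disease causation, cut-off at 0.5 (inclusive)
        return float(output) >= 0.5
    if tool == "MutPred":
        # every classification other than "benign" is grouped as pathogenic
        return output != "benign"
    raise ValueError(f"unknown tool: {tool}")


print(predicted_pathogenic("PolyPhen2", "possibly damaging"))  # True
print(predicted_pathogenic("KvSNP", 0.5))                      # True
print(predicted_pathogenic("MutPred", "benign"))               # False
```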
Calculation of Estimated Predictive Values
In order to estimate the likelihood of disease causation, an estimated predictive value (EPV, defined as the
probability of pathogenicity for a mutation identified in a case; EPV = (case frequency – control
frequency)/case frequency) was employed.6 Briefly, these calculations rely on the simplifying
assumptions that (1) the rate of background genetic variation is the same for the case and control
populations and (2) all mutations found in controls are benign, background mutations, given the low
prevalence of KCNQ1 C-terminus-mediated LQTS.
Applying these principles, we then calculated estimated predictive values (EPVs). The upper and
lower bounds of the 95% confidence intervals (95% CI) were calculated for all EPVs using the formula:
$\mathrm{CI} = 1 - 1/e^{\ln(\mathrm{RR}) \pm z \cdot \mathrm{SE}[\ln(\mathrm{RR})]}$, where RR is the relative ratio (mutation frequency in cases
divided by the mutation frequency in controls), z = 1.959964 for 1−α = 95%, and SE[ln(RR)] is the
standard error of the natural log of RR. All EPVs calculated here are specific to clinically definite cases as
defined previously and would be overestimates if applied to a less definite case.
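A worked sketch of the EPV and its confidence bounds follows. The EPV and CI expressions are those given above (note that EPV = 1 − 1/RR). The standard error of ln(RR) is not specified in the text, so the usual large-sample expression for the log of a ratio of two proportions is assumed here. The function name and the counts in the example are hypothetical and are not data from this study.

```python
import math

Z_95 = 1.959964  # z for a two-sided 95% interval, as stated in the text


def epv_with_ci(case_hits, n_cases, control_hits, n_controls, z=Z_95):
    """Return (EPV, lower bound, upper bound).

    Assumption: SE[ln(RR)] = sqrt(1/a - 1/n1 + 1/c - 1/n2), where a and c
    are the numbers of variant-positive cases and controls.
    """
    if case_hits == 0 or control_hits == 0:
        raise ValueError("RR is undefined when either group has no variants")
    case_freq = case_hits / n_cases
    control_freq = control_hits / n_controls
    rr = case_freq / control_freq
    epv = (case_freq - control_freq) / case_freq  # equals 1 - 1/RR
    se_log_rr = math.sqrt(
        1 / case_hits - 1 / n_cases + 1 / control_hits - 1 / n_controls
    )
    lower = 1 - 1 / math.exp(math.log(rr) - z * se_log_rr)
    upper = 1 - 1 / math.exp(math.log(rr) + z * se_log_rr)
    return epv, lower, upper


# Hypothetical counts: 30 of 1000 cases and 6 of 2000 controls (RR = 10).
epv, lower, upper = epv_with_ci(30, 1000, 6, 2000)
print(f"EPV = {epv:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

When the variant frequency in controls is zero, RR is undefined and the sketch raises an error rather than returning a value.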
1. Grantham R. Amino acid difference formula to help explain protein evolution. Science.
1974;185:862-864.
2. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein
function using the SIFT algorithm. Nat Protoc. 2009;4:1073-1081.
3. Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: Server and survey. Nucleic Acids
Res. 2002;30:3894-3900.
4. Stead LF, Wood IC, Westhead DR. KvSNP: Accurately predicting the effect of genetic variants in
voltage-gated potassium channels. Bioinformatics. 2011;27:2181-2186.
5. Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN, Mooney SD, Radivojac P. Automated
inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics.
2009;25:2744-2750.
6. Kapa S, Tester DJ, Salisbury BA, Harris-Kerr C, Pungliya MS, Alders M, Wilde AA, Ackerman MJ.
Genetic testing for long-QT syndrome: Distinguishing pathogenic mutations from benign variants.
Circulation. 2009;120:1752-1760.
Supplemental Table 1

| Nucleotide Change | Mutation | Highly Conserved Region? | Status |
|---|---|---|---|
| c.1046C>G | p.S349W | Yes | Case |
| c.1045T>C | p.S349P | Yes | Case |
| c.1048G>A | p.G350R | Yes | Case |
| c.1048G>C | p.G350R | Yes | Case |
| c.1052T>C | p.F351S | Yes | Case |
| c.1058T>C | p.L353P | Yes | Case |
| c.1061A>G | p.K354R | Yes | Case |
| c.1070A>G | p.Q357R | Yes | Case |
| c.1079G>T | p.R360M | Yes | Case |
| c.1079G>C | p.R360T | Yes | Case |
| c.1085A>G | p.K362R | Yes | Case |
| c.1093A>C | p.N365H | Yes | Case |
| c.1096C>T | p.R366W | Yes | Case |
| c.1097G>A | p.R366Q | Yes | Case |
| c.1097G>C | p.R366P | Yes | Case |
| c.1111G>A | p.A371T | Yes | Case |
| c.1115C>A | p.A372D | Yes | Case |
| c.1117T>C | p.S373P | Yes | Case |
| c.1121T>A | p.L374H | Yes | Case |
| c.1135T>G | p.W379G | Yes | Case |
| c.1136G>C | p.W379S | Yes | Case |
| c.1140G>T | p.R380S | Yes | Case |
| c.1142G>A | p.C381Y | Yes | Control |
| c.1153G>A | p.E385K | Yes | Case |
| c.1165T>C | p.S389P | Yes | Case |
| c.1166C>A | p.S389Y | Yes | Case |
| c.1172C>T | p.T391I | Yes | Case |
| c.1174T>C | p.W392R | No | Case |
| c.1179G>T | p.K393N | No | Control |
| c.1178A>T | p.K393M | No | Case |
| c.1189C>T | p.R397W | No | Control |
| c.1190G>A | p.R397Q | No | Rare Control |
| c.1193A>G | p.K398R | No | Case |
| c.1222C>G | p.P408A | No | Control |
| c.1249G>A | p.V417M | No | Case |
| c.1283A>G | p.D428G | No | Rare Control |
| c.1321C>T | p.P441S | No | Control |
| c.1338C>G | p.D446E | No | Case |
| c.1336G>A | p.D446N | No | Rare Control |
| c.1340C>A | p.P447H | No | Rare Control |
| c.1343C>G | p.P448R | No | Control |
| c.1343C>T | p.P448L | No | Case |
| c.1345G>A | p.E449K | No | Control |
| c.1348G>A | p.E450K | No | Control |
| c.1352G>A | p.R451Q | No | Control |
| c.1351C>T | p.R451W | No | Case |
| c.1355G>A | p.R452Q | No | Control |
| c.1354C>T | p.R452W | No | Control |
| c.1378G>A | p.G460S | No | Control |
| c.1388G>C | p.S463T | No | Control |
| c.1430C>T | p.P477L | No | Case |
| c.1498A>C | p.I500L | No | Rare Control |
| c.1520G>A | p.R507Q | No | Rare Control |
| c.1531C>T | p.R511W | Yes | Case |
| c.1553G>A | p.R518Q | Yes | Control |
| c.1552C>G | p.R518G | Yes | Case |
| c.1553G>C | p.R518P | Yes | Case |
| c.1556G>A | p.R519H | Yes | Rare Control |
| c.1555C>T | p.R519C | Yes | Case |
| c.1559T>G | p.M520R | Yes | Case |
| c.1565A>C | p.Y522S | Yes | Case |
| c.1571T>G | p.V524G | Yes | Case |
| c.1573G>A | p.A525T | Yes | Case |
| c.1574C>T | p.A525V | Yes | Case |
| c.1576A>G | p.K526E | Yes | Control |
| c.1597C>T | p.R533W | Yes | Case |
| c.1615C>T | p.R539W | Yes | Case |
| c.1616G>A | p.R539Q | Yes | Case |
| c.1621G>A | p.V541I | Yes | Case |
| c.1627G>A | p.E543K | Yes | Case |
| c.1637C>T | p.S546L | Yes | Case |
| c.1640A>G | p.Q547R | Yes | Case |
| c.1643G>A | p.G548D | Yes | Case |
| c.1661T>C | p.V554A | Yes | Case |
| c.1664G>A | p.R555H | Yes | Case |
| c.1663C>T | p.R555C | Yes | Case |
| c.1663C>A | p.R555S | Yes | Case |
| c.1669A>G | p.K557E | Yes | Case |
| c.1685G>T | p.R562M | Yes | Case |
| c.1697C>T | p.S566F | Yes | Case |
| c.1697C>A | p.S566Y | Yes | Case |
| c.1696T>C | p.S566P | Yes | Case |
| c.1700T>G | p.I567S | Yes | Case |
| c.1700T>C | p.I567T | Yes | Case |
| c.1702G>A | p.G568R | Yes | Case |
| c.1703G>C | p.G568A | Yes | Case |
| c.1705A>G | p.K569E | Yes | Case |
| c.1712C>T | p.S571L | Yes | Case |
| c.1719C>A | p.F573L | Yes | Case |
| c.1738A>G | p.S580G | No | Rare Control |
| c.1747C>T | p.R583C | No | Case |
| c.1748G>A | p.R583H | No | Case |
| c.1750G>A | p.G584S | No | Rare Control |
| c.1756A>G | p.N586D | Yes | Case |
| c.1760C>T | p.T587M | Yes | Case |
| c.1766G>A | p.G589D | Yes | Case |
| c.1768G>A | p.A590T | Yes | Case |
| c.1771C>T | p.R591C | Yes | Case |
| c.1772G>A | p.R591H | Yes | Case |
| c.1781G>A | p.R594Q | Yes | Case |
| c.1781G>C | p.R594P | Yes | Case |
| c.1786G>A | p.E596K | Yes | Case |
| c.1799C>T | p.T600M | Yes | Control |
| c.1831G>A | p.D611N | No | Control |
| c.1831G>T | p.D611Y | No | Case |
| c.1855T>A | p.L619M | No | Case |
| c.1861G>A | p.G621S | No | Rare Control |
| c.1876G>A | p.G626S | No | Case |
| c.1903G>A | p.G635R | No | Case |
| c.1927G>A | p.G643S | No | Control |
| c.1942G>A | p.V648I | No | Control |
| c.1973C>A | p.T658N | No | Rare Control |
| c.1987G>A | p.E663K | No | Rare Control |

Helix column (row positions not preserved in this copy): Helix A for 11 of the variants p.S349W to
p.P448L; Helix B for 13 and Helix C for 7 of the variants p.E449K to p.I567T; Helix D for 12 of the
variants p.G568R to p.E663K.