Supplementary materials and methods, tables and figures
SLC6A4 allelic expression imbalance
MATERIALS AND METHODS
Genotyping
For the 5HTTLPR, two oligonucleotide primers (5′-GGCGTTGCCGCTCTGAATGC-3′ and
5′-GAGGGACTGAGCTGGACAACCCAC-3′) were used to generate 5HTTLPR allele-specific
fragments (484 base pairs [bp] and 528 bp) (Lerman et al. 1998) by polymerase chain reaction
(PCR). The PCR products were separated by electrophoresis on a 4% agarose gel, stained with
ethidium bromide and visualized under UV light. The PCR was performed in a 10 μL reaction
including 10 ng DNA, 1 μL NH4 buffer, 0.25 μL dNTPs (40 mM), 2.35 μL 2× Polymate
(BioLine, London, United Kingdom), 0.3 μL MgCl2 (50 mM), 1 μL primers (10 μM) and 0.1 μL
BioTaq DNA Polymerase (BioLine). The PCR consisted of a 5 min denaturing step at 95°C (1
cycle), then 95°C for 30 sec, 62°C for 45 sec, 72°C for 1 min (35 cycles), and finally 72°C for 4
minutes (1 cycle). The genotyping of the A/G SNP within the 5HTTLPR long allele (Hu et al.
2006) was obtained by incubating 10 μL of PCR products with 4 units of restriction enzyme
MspI which recognizes the restriction site (CCGG) created by the G allele. The fragments for the
A allele (62, 126, 340 bp) and the G allele (62, 126, 166 and 174 bp) were separated on 3%
agarose gel, stained with ethidium bromide and visualized under UV light. Assays for the two SNPs,
rs2020933 and rs8073965, were designed and genotyped using the Sequenom platform as
described below for the cDNA allelotyping, with a success rate of approximately 90%. Primer details
are available on request.
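
As a quick consistency check on the MspI digest described above, the fragments expected for each allele should add up to the undigested long-allele amplicon (528 bp). The following Python sketch uses only the fragment sizes quoted in the text; the function and variable names are illustrative and are not part of the original protocol.

```python
# Fragment sizes (bp) quoted in the text for the MspI digest of the long allele.
LONG_AMPLICON_BP = 528
MSPI_FRAGMENTS = {
    "A": [62, 126, 340],
    "G": [62, 126, 166, 174],
}


def fragments_match_amplicon(fragments, amplicon_bp=LONG_AMPLICON_BP):
    """Return True if the digest fragments sum to the undigested amplicon length."""
    return sum(fragments) == amplicon_bp


def diagnostic_bands(fragments_by_allele):
    """Return, per allele, the band sizes that no other allele produces."""
    result = {}
    for allele, frags in fragments_by_allele.items():
        others = set()
        for other, other_frags in fragments_by_allele.items():
            if other != allele:
                others.update(other_frags)
        result[allele] = sorted(set(frags) - others)
    return result


if __name__ == "__main__":
    for allele, frags in MSPI_FRAGMENTS.items():
        print(allele, frags, fragments_match_amplicon(frags))
    print(diagnostic_bands(MSPI_FRAGMENTS))
```

Both fragment sets sum to 528 bp. The 340 bp band is specific to the A allele, while the 166 bp and 174 bp bands (which together make up 340 bp) are specific to the G allele.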
Sequencing
The Ensembl web database was used to obtain a copy of the SLC6A4 gene sequence. Primer3
was used to design primers that covered the promoter and all 15 exons of the gene. DNA was
amplified in a 50l PCR reaction, with 10 pmol of oligonucleotides, 100 ng of DNA, 0.2 units of
Taq Gold, 8 mM dNTP, 8 mM 1
PCR buffer, and 25 mM MgCl2. PCR products were purified
in a 96-well Millipore purification plate and resuspended in 30 l of H2O. Two sequencing
reactions were prepared for each DNA sample, one with the forward primer and one with the
reverse primer. The PCR reagents were removed from solution by an ethanol precipitation in the
presence of sodium acetate. All sequencing reactions were run out on anABI3700 sequencer and
assembled by using PHRED/PHRAP and visualised using the Consed program.
Measure of allelic expression imbalance
Thirty-three lymphoblastoid cell lines of Centre d'Etude du Polymorphisme Humain (CEPH)
individuals were obtained from Coriell Cell Repositories (Camden, NJ) and were grown in RPMI
1640 medium supplemented with 15% fetal calf serum, 1% L-glutamine and 1% penicillin-streptomycin
in an incubator set to 37 °C with 5% carbon dioxide. RNA was isolated using the
RNeasy midi kit (Qiagen). RNA samples were treated with the TURBO DNA-free™ kit
(Ambion, Austin, Texas) according to the manufacturer's recommendations, to eliminate possible
contamination from genomic DNA (gDNA). cDNA was prepared using 50 ng of random primers
and SuperScript™ III reverse transcriptase (Invitrogen) with 1 μg of RNA according to the
manufacturer's protocol. To remove the RNA, the cDNA was treated with RNase H (Invitrogen).
The cDNA was tested for gDNA contamination using primers (GTCGTTTGAAGCCAGGAGAT
and GGCTGAATGTTGTCGGATTT) within the coding region of an unrelated gene (STXBP4),
which give a different amplicon length in the presence of gDNA contamination (159 bp for
cDNA and 339 bp for gDNA).
Individuals heterozygous for the transcribed SNP (rs1042173) were used to measure the
relative expression of the alleles within each mRNA sample. Genomic DNA of heterozygotes
was used to correct for unequal readings of the alleles in the assay. For each cell line we
generated cDNA in two separate reactions which were allelotyped separately. Each cDNA was
allelotyped twelve times together with four samples of genomic DNA of the same cell line.
Assays for the PCR and associated extension reaction were designed by SpectroDESIGNER
software (Sequenom, San Diego, CA). The PCR primers are
ACGTTGGATGGCAGCACATGGATTAGAAGG and
ACGTTGGATGAGAACAGGGATGCTATCTCG and the extension primer is
AGTAGATTCCAGCAATAAAATT. PCRs were performed in 10 μL reactions with final
concentrations of 2.5 mM MgCl2, 200 μM dNTPs, 0.2 U of HotStar Taq (Qiagen), and primer
concentration of 0.2 mM. The PCR profile is 45 cycles of 20 s at 95°C, 30 s at 56°C, and 1 min
at 72°C. Non-incorporated dNTPs were removed with shrimp alkaline phosphatase for 20 min at
37°C. The mass-extension reaction was performed using MassEXTEND enzymes
thermosequenase (Amersham Pharmacia), homogenous MassEXTEND (hME) termination
mixes, and hME extension primers; 55 cycles were performed for 5 s at 94°C, for 5 s at 52°C,
and for 5 s at 72°C. Unincorporated ddNTPs and dNTPs were removed with SpectroCLEAN
resin, and products were transferred to a 384 SpectroCHIP (Sequenom, San Diego, CA) using a
SpectroPOINT robot (Sequenom, San Diego, CA). The chip was read using the Bruker Autoflex
Mass Spectrometer system (Bruker-Sequenom, San Diego, CA). The allelotyping was analyzed
using MassARRAY Typer version 3.1 software (Sequenom). Peak areas were used to calculate
the allele frequencies.
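
To make the last step concrete, the following Python sketch shows one way peak areas could be converted into an allele frequency and an allelic ratio for a heterozygous sample, and averaged over replicate spots. The peak-area values and function names are hypothetical; the actual calculation was carried out by the MassARRAY Typer software, whose internals are not described here.

```python
from statistics import mean


def allele_frequency(area_1, area_2):
    """Fraction of the total peak signal carried by allele 1."""
    return area_1 / (area_1 + area_2)


def allelic_ratio(area_1, area_2):
    """Ratio of the allele 1 peak area to the allele 2 peak area."""
    return area_1 / area_2


def mean_ratio(replicates):
    """Average the allelic ratio over replicate (area_1, area_2) measurements."""
    return mean([allelic_ratio(a1, a2) for a1, a2 in replicates])


if __name__ == "__main__":
    # Hypothetical peak areas for three replicate spots of one cDNA sample.
    replicates = [(60.0, 40.0), (66.0, 44.0), (57.0, 38.0)]
    print("allele 1 frequency:", allele_frequency(60.0, 40.0))
    print("mean allelic ratio:", mean_ratio(replicates))
```

Which allele is placed in the numerator is a convention; it only needs to be the same for the cDNA and the genomic DNA measurements of a cell line.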
Allelic expression imbalance analysis
The expression ratios between the G and T allele of SNP rs1042173 were corrected for unequal
detection using the average ratio in genomic DNA from heterozygotes (mean ratio = 1.18). The
expression ratios were analysed on the log (base 10) scale because the distributions of the log
ratios tend to be closer to a normal distribution. The presence of a significant AEI in each
genotyping group was determined by a Student's t test of whether the mean of the log
transformed values was different from 0. The association mapping analysis was based on the
expectation that in the case of one cis-acting variant, the heterozygotes should show AEI (i.e.
allelic ratio significantly different from one) but not the homozygotes (allelic ratio close to 1).
Under the null hypothesis (that the SNPs are not functional and not in LD with any other
functional SNP) the distribution of allelic ratios should be equal in the heterozygote and the
homozygote groups (Table 1). For each of the tested SNPs, the significance of the difference
between the log-transformed allelic ratios of the homozygotes and the heterozygotes was tested by a
Wilcoxon rank-sum test using the 'wilcox.test' function in the R statistical analysis package
version 2.1.1 (R Development Core Team 2004). The alleles of the 5HTTLPR were phased
relative to other SNPs using the program PHASE2 (Stephens et al. 2001). There are four possible
haplotype pairs (diplotypes): homozygotes at the tested SNP for one or the other allele, and
heterozygotes in two possible ways (Table 1). For SNPs creating two possible heterozygous
diplotypes, the association test was performed twice after inverting the allelic ratios of one
subgroup or the other. For example, an A/B SNP and the transcribed SNP (rs1042173; T/G)
can form two types of heterozygous diplotypes: T-A/G-B and G-A/T-B. In this case, the maximum
P-value of the two tests was recorded. Similarly, the proportion of variance in AEI explained by the heterozygote vs.
homozygote genotypes of the 5HTTLPR, rs16965628 and rs2020933 variants was estimated
twice using analysis of variance (ANOVA), recording the maximum proportion of the two
estimates. The 5% threshold for significance was empirically evaluated by randomly permuting
the allelic ratios 100,000 times, applying the above test and recording the minimum P-value
across all SNPs.
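
The correction and testing steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script used in the study: the cDNA ratios are invented, the helper names are hypothetical, and only the genomic mean ratio of 1.18 is taken from the text. The sketch corrects each cDNA ratio by the genomic ratio, moves to the log10 scale, computes the one-sample t statistic against 0, and shows that inverting a ratio simply changes the sign of its logarithm.

```python
import math
from statistics import mean, stdev

GDNA_MEAN_RATIO = 1.18  # mean heterozygote ratio in genomic DNA, from the text


def corrected_log_ratio(cdna_ratio, gdna_ratio=GDNA_MEAN_RATIO):
    """Correct a cDNA allelic ratio for unequal detection and take log10."""
    return math.log10(cdna_ratio / gdna_ratio)


def one_sample_t(values, mu=0.0):
    """Student's t statistic for H0: mean(values) == mu (needs >= 2 values)."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / math.sqrt(n))


def invert(log_ratios):
    """Inverting a ratio (x -> 1/x) negates its logarithm."""
    return [-x for x in log_ratios]


if __name__ == "__main__":
    # Hypothetical raw cDNA ratios for one genotype group.
    raw = [1.40, 1.55, 1.30, 1.62]
    logs = [corrected_log_ratio(r) for r in raw]
    print("corrected log10 ratios:", logs)
    print("t statistic:", one_sample_t(logs))
    print("after inversion:", invert(logs))
```

The P-value would then be read from the t distribution with n - 1 degrees of freedom, which is not computed here because the Python standard library does not provide it.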
Table 1. Four possible diplotypes with the expected AEI outcome

Diplotype                                 Expected allelic expression ratio
(transcribed SNP a - tested SNP b)        between T/G (functional SNP: A > B c)
T-A / G-A                                 balanced: ratio = 1
T-B / G-B                                 balanced: ratio = 1
T-A / G-B                                 imbalanced: ratio > 1
T-B / G-A                                 imbalanced: ratio < 1

a The transcribed SNP (rs1042173) is used as a tag to measure the relative abundance of allelic
transcripts.
b The tested SNP is analysed for association with AEI.
c Functional SNP with a higher expression of the A allele relative to the B allele.
Association analysis with neuroticism
A detailed description of the subjects used for the association analysis has been published
elsewhere 22, 23. This study was approved by the Oxford Local Ethical Review Committee, and
informed consent was obtained from all participants. We have previously collected neuroticism (N) scores
from 88,142 individuals from an ethnically homogenous population from the South West and
South East of England 23. We identified 768 unrelated individuals from the extremes (10%) of
the N-score distribution that were previously genotyped for the 5HTTLPR 22. We have estimated 27
that the sample has sufficient power (80%) to detect a genetic effect contributing only 0.53%
of phenotypic variance at a 1% alpha level, assuming no dominance effect and a QTL increaser
allele frequency of 5%. Allele frequencies were compared between the high and low neuroticism
score extreme groups using Fisher's exact test as implemented in R ('fisher.test'). The haplotypes
were analyzed based on a score statistic using the "haplo.stats" package for R 24.
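
For readers who want to see what the allele-frequency comparison involves, the following is a minimal Python sketch of a two-sided Fisher's exact test on a 2×2 table of allele counts. It is a re-implementation for illustration, not the R code used in the study. It sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is understood to be the convention followed by 'fisher.test' for 2×2 tables. The example uses the rs2020933 allele counts from Table 3; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    total = Fraction(0)
    for x in range(lowest, highest + 1):
        p = prob(x)
        if p <= p_observed:
            total += p
    return float(total)


if __name__ == "__main__":
    # rs2020933 allele counts from Table 3: (A, T) in each extreme group.
    p_value = fisher_exact_two_sided(38, 818, 31, 553)
    print("two-sided P-value:", p_value)
```

Exact fractions are used so that ties between table probabilities are decided without floating-point error.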
Table 2. Linkage disequilibrium in the 33 CEPH samples between the variants
associated with allelic expression imbalance

Variant 1 (MAF a)      Variant 2      r2        D'
rs16965628 (7.6%)      5HTTLPR        0.019     0.54
rs2020933 (6.1%)       rs16965628     0.79      1.00
5HTTLPR (44%)          rs2020933      0.0094    0.43

a Minor allele frequency in the 33 CEPH samples
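
The two linkage disequilibrium measures in Table 2 follow from standard definitions, sketched below in Python. The function name is hypothetical, and the example input makes one explicit assumption that is not stated in the source: that every rs2020933 minor allele sits on a haplotype that also carries the rs16965628 minor allele, so the joint haplotype frequency equals the rs2020933 minor allele frequency.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab:     frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # MAFs from Table 2; the joint frequency 0.061 is the assumption
    # described in the lead-in, not a reported value.
    d, d_prime, r2 = ld_stats(0.061, 0.076, 0.061)
    print(f"D = {d:.4f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

Under that assumption the sketch gives D' = 1.00 and r2 ≈ 0.79, in line with the values tabulated for rs2020933 and rs16965628.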
Table 3. Genotype and allele distribution

rs8073965
                      GG       GT       TT       G        T
High N   counts       382      19       0        783      19
         frequency    0.953    0.047    0.000    0.976    0.024
Low N    counts       257      14       0        528      14
         frequency    0.948    0.052    0.000    0.974    0.026

rs2020933
                      TT       AT       AA       T        A
High N   counts       391      36       1        818      38
         frequency    0.914    0.084    0.002    0.956    0.044
Low N    counts       261      31       0        553      31
         frequency    0.894    0.106    0.000    0.947    0.053

High N = individuals with high neuroticism score, Low N = low neuroticism
score.
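
The allele counts and frequencies in Table 3 follow directly from the genotype counts, since each homozygote contributes two copies of an allele and each heterozygote one. The short Python sketch below reproduces them; the function names are illustrative only.

```python
def allele_counts(n_hom_major, n_het, n_hom_minor):
    """Return (major, minor) allele counts from the three genotype counts."""
    major = 2 * n_hom_major + n_het
    minor = 2 * n_hom_minor + n_het
    return major, minor


def frequencies(counts, digits=3):
    """Convert a sequence of counts into rounded relative frequencies."""
    total = sum(counts)
    return [round(c / total, digits) for c in counts]


if __name__ == "__main__":
    # rs8073965 genotype counts (GG, GT, TT) in the first row group of Table 3.
    genotypes = (382, 19, 0)
    alleles = allele_counts(*genotypes)
    print("genotype frequencies:", frequencies(genotypes))
    print("allele counts:", alleles)
    print("allele frequencies:", frequencies(alleles))
```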
Table 4. Haplotype association with neuroticism tested using a haplotype score test

                           Haplotype frequencies           Haplotype    Simulated
rs2020933    5HTTLPR       Total     Low N     High N      score a      P-value b
A            L             0.044     0.047     0.042       -0.67        0.50
A            S             0.0041    0.0066    0.0030      -0.71        0.54
T            L             0.55      0.56      0.54        -0.45        0.65
T            S             0.41      0.39      0.42        0.81         0.42

a Haplotype score statistics were calculated using the Haplo.Stats package in R
b The simulated P-value for the maximum score statistic is 0.81
[Figure 1. Top panel: scatter plot with axes "Genomic DNA log10 ratios" and "cDNA log10 ratios",
both ranging from -0.4 to 0.4. Bottom panel: histogram of "Frequency" (0 to 18) by "Ratio",
with bins 1-1.2, 1.2-1.5, 1.5-1.8, 1.8-2.1 and 2.1-3.]

Figure 1. The distribution of allele-specific expression ratios. In the top plot each dot is the
log10 of the average allelic expression ratio of one of the CEPH samples for cDNA against genomic
DNA. The bottom histogram is the distribution of allelic expression ratios. For consistency,
ratios below 1 were inverted.
[Figure 2. Top panel: "Average expression ratio" (0 to 1.4) by "5-HTTLPR genotypes", with groups
LS (n=17), LAS (n=14), LGS (n=3), LALG (n=2), LALA (n=8) and LL & SS (n=16). Bottom panel:
"Average expression ratio" (0 to 1.4) by "5-HTTLPR and rs2020933 genotypes", with groups
AT (n=4), LS (n=14) and Homozygous (n=15).]

Figure 2. Mean allelic expression imbalance of different genotypes. The error bars are the 95%
confidence interval for the mean. Numbers of heterozygotes for each group are shown in
parentheses. The average allelic expression imbalance for different genotypes at the 5HTTLPR is
shown at top, including the A/G SNP within the long allele. Below is shown the average allelic
expression imbalance for heterozygotes for rs2020933 (AT), heterozygotes for the 5HTTLPR
which are not heterozygotes for rs2020933, and homozygotes for both rs2020933 and the
5HTTLPR.