Supplementary material to:
Methylome-wide comparison of human genomic DNA extracted
from whole blood and from EBV transformed lymphocyte cell lines
Karolina Åberg (PhD)a*; Amit N. Khachane (PhD)a; Gábor Rudolf (PhD)a; Srilaxmi
Nerella (MS)a; Douglas A. Fugman (PhD)b; Jay A. Tischfield (PhD)b; Edwin J.C.G. van
den Oord (PhD)a
a Center for Biomarker Research and Personalized Medicine, School of Pharmacy, Virginia Commonwealth
University, PO Box 980533, Richmond, VA 23298, USA; b Department of Genetics, Rutgers University,
145 Bevier Road, Piscataway, NJ 08854, USA.
*Correspondence to: Karolina Åberg, Center for Biomarker Research and
Personalized Medicine, School of Pharmacy, Virginia Commonwealth University,
1112 East Clay Street, P.O. Box 980533, Richmond, VA 23298.
Tel: +1 804-628 3023, fax: +1 804-628 3991, e-mail: kaaberg@vcu.edu
Methods
Lymphocyte cell line (LCL) establishment and DNA extraction
Lymphocyte cell lines (LCLs) were established by separating lymphocytes from
whole blood (WB) by centrifugation on a Nycoprep density gradient (Axis-Shield, Oslo,
Norway) and transforming them with Epstein-Barr virus (EBV) isolated from the B95-8 cell line
(in-house preparation) according to standard operating procedures in place at the Rutgers
University Cell and DNA Repository (RUCDR). Briefly, the lymphocyte layer from the
Nycoprep gradient was washed, re-suspended in culture medium containing RPMI-1640, 25% fetal calf
serum (FCS), and 0.1% phytohemagglutinin (PHA), and incubated with the EBV at 37°C, 5% CO2
in a humidified incubator. Cultures were maintained by twice weekly
examination and medium supplementation with 15% FCS/RPMI-1640 as needed. After
5-6 weeks, when the transformed LCL cultures exhibited the desired density of healthy
aggregates, they were transferred to larger flasks for expansion prior to DNA extraction and
cryopreservation. For all samples included in this investigation, DNA was extracted from
the LCLs at the same passage at which they were cryopreserved. DNA was also
extracted directly from aliquots of the same WB samples that were used to create the
LCLs. DNA was extracted from LCL cultures or WB using AutoPURE LS automated
DNA extractors (Qiagen) with standard PureGene extraction reagents [1,2]. This is an
inorganic, salt-precipitation (i.e., phenol-free) method that eliminates the hazards of
phenol exposure and alleviates environmental concerns. All buffers and reagents are
standardized and meet Qiagen’s strict quality control procedures. The quality of DNA
from WB and from LCLs was verified using restriction enzyme digestion and agarose
gel electrophoresis, PCR, and UV spectroscopy according to RUCDR standard
operating procedures. RUCDR maintains secure, state-of-the-art facilities where each
operation is computerized to minimize sample mislabeling and/or cross-contamination.
Probe correlation
Probes that show low variation between the two blood samples from the same individual but high
variation between samples from different individuals will have a high probe correlation,
indicating a variable methylation site. A low probe correlation, where the variation between the
two samples from the same individual is high, is likely to indicate a methodological issue,
such as a failing probe or an empty probe (a probe located in a genomic region without
any methylation sites). To identify the variable methylation sites, we used a previously
developed procedure [3]. In short, the array signal $y_{ijk}$ for biosample i on probe j and
replicate number k can be written as:
$$y_{ijk} = m_j + a_{ij} + e_{ijk}$$
(Equation 1)
where $m_j$ is the average signal at probe j, $a_{ij}$ the biosample-specific deviation at
probe j, and $e_{ijk}$ the measurement error for biosample i on probe j for replicate k. In this
study, we obtained two replicates (k = 1, 2) and calculated, for a given probe j, the Pearson
(product-moment) correlation between the two replicates using the data from all
biosamples. This correlation is labeled the “probe correlation”. It can be shown that the
correlation for probe j across the two replicates equals:
$$\mathrm{COR}(y_{i1}, y_{i2})_j = \frac{\mathrm{VAR}(A)_j}{\mathrm{VAR}(A)_j + \mathrm{VAR}(E)_j}$$
(Equation 2)
where $\mathrm{VAR}(A)_j$ and $\mathrm{VAR}(E)_j$ are the variances, across biosamples, of the
methylation signals and of the measurement error at probe j, respectively. This probe
correlation is an index of the signal-to-error ratio, as it equals the biological variation in
methylation signals across biosamples divided by the total variance.
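
As an illustration of Equation 2, the following minimal Python sketch simulates data according to Equation 1 and computes the probe correlation for each probe. It is not the procedure's original code: the function names (`pearson`, `probe_correlations`), the data layout (one row per biosample, one column per probe), and the simulated variances are assumptions made for this example only.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant signal
    return sxy / math.sqrt(sxx * syy)


def probe_correlations(rep1, rep2):
    """rep1[i][j] and rep2[i][j] hold the signal of biosample i at probe j
    in replicate 1 and replicate 2 (hypothetical layout).
    Returns one correlation per probe, computed across biosamples."""
    n_probes = len(rep1[0])
    return [
        pearson([row[j] for row in rep1], [row[j] for row in rep2])
        for j in range(n_probes)
    ]


# Simulated data following Equation 1 (all values below are assumptions).
rng = random.Random(0)
n_biosamples = 4000
var_a, var_e = 4.0, 1.0  # Equation 2 then predicts 4 / (4 + 1) = 0.8
rep1, rep2 = [], []
for _ in range(n_biosamples):
    a = rng.gauss(0, math.sqrt(var_a))  # biosample-specific deviation a_ij
    # Probe 0: variable methylation site. Probe 1: "empty" probe, VAR(A) = 0.
    rep1.append([10 + a + rng.gauss(0, math.sqrt(var_e)), 10 + rng.gauss(0, 1)])
    rep2.append([10 + a + rng.gauss(0, math.sqrt(var_e)), 10 + rng.gauss(0, 1)])

cors = probe_correlations(rep1, rep2)
print(cors)
```

Under these assumptions, the correlation for the variable probe should approach VAR(A)/(VAR(A) + VAR(E)) as the number of biosamples grows, while the empty probe should yield a correlation near zero.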
Sample correlation
The sample correlation for a given biosample i equals the correlation between the
two replicates calculated across the data from all probes. Using assumptions similar to
those underlying Equation 2, the sample correlation for biosample i
measured on two occasions equals:
$$\mathrm{COR}(y_{j1}, y_{j2})_i = \frac{\mathrm{VAR}(M)_i}{\mathrm{VAR}(M)_i + \mathrm{VAR}(E)_i}$$
(Equation 3)
where $\mathrm{VAR}(M)_i$ is the variance in methylation signals across all probes for
biosample i and $\mathrm{VAR}(E)_i$ is the variance in the measurement error across all probes for
biosample i. If measurement error is large relative to differences among probes in their
methylation status, in addition to observing low probe correlations, we would expect the
sample correlations to be low.
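
The sample correlation of Equation 3 can be sketched in the same way. The Python example below is illustrative only: the function names, the data layout (one list of probe signals per biosample and replicate), and the simulated variances are assumptions, not part of the original analysis.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant signal
    return sxy / math.sqrt(sxx * syy)


def sample_correlations(rep1, rep2):
    """rep1[i] and rep2[i] are the probe signals of biosample i in
    replicate 1 and replicate 2 (hypothetical layout).
    Returns one correlation per biosample, computed across probes."""
    return [pearson(r1, r2) for r1, r2 in zip(rep1, rep2)]


rng = random.Random(1)
n_probes = 5000


def simulate(error_sd):
    """Simulate two replicates of one biosample with VAR(M) = 9 (assumed)."""
    signal = [rng.gauss(0, 3.0) for _ in range(n_probes)]
    r1 = [s + rng.gauss(0, error_sd) for s in signal]
    r2 = [s + rng.gauss(0, error_sd) for s in signal]
    return r1, r2


good1, good2 = simulate(1.0)  # VAR(E) = 1: Equation 3 predicts 9 / 10 = 0.9
poor1, poor2 = simulate(3.0)  # VAR(E) = 9: Equation 3 predicts 9 / 18 = 0.5

cors = sample_correlations([good1, poor1], [good2, poor2])
print(cors)
```

In this simulation the biosample with the larger measurement error yields the lower sample correlation, which is the pattern described above.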
Inter-correlation between adjacent probes
To investigate the methylation pattern, we combined highly inter-correlated
adjacent probes into blocks. Differences in block structures indicate differences in the
methylation pattern between WB DNA and LCL DNA. To create these blocks, we used a
two-step algorithm. Starting with the first two probes at the p-telomere of each
chromosome, we first calculated the inter-correlation between adjacent probes and kept
adding probes to that “block” until the average inter-correlation dropped below a
threshold of 0.5. The idea is that the methylation signal will span a larger chromosomal
region but that altered methylation patterns may cause the inter-correlation to drop below
our threshold, thereby producing multiple blocks. As poor probes (i.e., probes with a
large measurement error) will also “break up” methylation patterns, we used a second
step. In this second step, we calculated the average inter-correlation between probes in
adjacent blocks. If the adjacent blocks were no further apart than 500 bp and their
average inter-correlation was higher than our threshold of 0.5, we combined them again
into a single block. The R script for block construction and the blocks created in this
investigation are available through the authors' website:
http://www.people.vcu.edu/~kaaberg/
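
The two-step procedure described above can be sketched in Python as follows. This is not the authors' R script; it is a minimal illustration under explicit assumptions. In particular, the "average inter-correlation" is taken here to be the mean Pearson correlation between a candidate probe and all probes already in the block (step 1) or between all cross-block probe pairs (step 2), and the distance between blocks is taken to be the gap between the last probe of one block and the first probe of the next. The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def mean_cross_cor(signals, probes_a, probes_b):
    """Mean correlation over all probe pairs (a, b), a in probes_a, b in probes_b."""
    cors = [pearson(signals[a], signals[b]) for a in probes_a for b in probes_b]
    return sum(cors) / len(cors)


def build_blocks(signals, positions, threshold=0.5, max_gap=500):
    """signals[p]: values of probe p across samples, probes ordered along the
    chromosome starting at the p-telomere; positions[p]: coordinate in bp."""
    # Step 1: extend the current block while the candidate probe's mean
    # correlation with the block stays at or above the threshold.
    blocks = [[0]]
    for p in range(1, len(signals)):
        if mean_cross_cor(signals, blocks[-1], [p]) >= threshold:
            blocks[-1].append(p)
        else:
            blocks.append([p])
    # Step 2: re-join adjacent blocks that are close together and whose
    # average cross-block correlation exceeds the threshold.
    merged = [blocks[0]]
    for block in blocks[1:]:
        prev = merged[-1]
        gap = positions[block[0]] - positions[prev[-1]]
        if gap <= max_gap and mean_cross_cor(signals, prev, block) > threshold:
            merged[-1] = prev + block
        else:
            merged.append(block)
    return merged


# Toy data: four mutually orthogonal, zero-mean sample patterns.
u = [1, 1, 1, 1, -1, -1, -1, -1]
v = [1, 1, -1, -1, 1, 1, -1, -1]
w = [1, -1, 1, -1, 1, -1, 1, -1]
x = [1, 1, -1, -1, -1, -1, 1, 1]


def combo(cu, cv, cw, cx):
    """Linear combination of the four patterns."""
    return [cu * a + cv * b + cw * c + cx * d for a, b, c, d in zip(u, v, w, x)]


signals = [
    combo(1, 0, 0, 0),    # probe 0
    combo(1, 0, 0.2, 0),  # probe 1: same signal as probe 0 plus slight noise
    combo(1, 2, 0, 0),    # probe 2: noisy probe that breaks the block in step 1
    combo(1, 0.5, 0, 0),  # probe 3
    combo(0, 0, 0, 1),    # probe 4: unrelated methylation pattern
    combo(0, 0, 0.2, 1),  # probe 5
]
blocks_close = build_blocks(signals, [100, 200, 300, 400, 500, 600])
blocks_far = build_blocks(signals, [100, 200, 1000, 1100, 1200, 1300])
print(blocks_close)
print(blocks_far)
```

In the toy data, probe 2 splits the first four probes into two blocks in step 1. Step 2 rejoins them when the probes are closely spaced but leaves them separate when the gap between the blocks exceeds 500 bp.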
Results
The distribution of Cohen’s D is shown in Figure S1. Similar to Figure 1, this
figure shows that the majority of probes show small differences between the technical
duplicates of WB, while much bigger differences are observed for the comparisons of
WB vs. LCL DNA.
Figure S1. Distribution of Cohen’s D for duplicates of WB DNA (top) and for each of the WB
samples vs. LCLs (middle and bottom, respectively). Probes with complete data from all
samples that showed inter-individual variation are included.
Acknowledgment
Control subjects were obtained from the National Institute of Mental Health
Schizophrenia Genetics Initiative (NIMH-GI); data and biomaterials were collected by
the "Molecular Genetics of Schizophrenia II" (MGS-2) collaboration. The investigators
and coinvestigators are: ENH/Northwestern University, Evanston, IL, MH059571, Pablo
V. Gejman, M.D. (Collaboration Coordinator; PI), Alan R. Sanders, M.D.; Emory
University School of Medicine, Atlanta, GA, MH59587, Farooq Amin, M.D. (PI);
Louisiana State University Health Sciences Center; New Orleans, Louisiana, MH067257,
Nancy Buccola, APRN, B.C., M.S.N. (PI); University of California-Irvine, Irvine, CA,
MH60870, William Byerley, M.D. (PI); Washington University, St. Louis, MO, U01,
MH060879, C. Robert Cloninger, M.D. (PI); University of Iowa, Iowa, IA, MH59566,
Raymond Crowe, M.D. (PI), Donald Black, M.D.; University of Colorado, Denver, CO,
MH059565, Robert Freedman, M.D. (PI); University of Pennsylvania, Philadelphia, PA,
MH061675, Douglas Levinson, M.D. (PI); University of Queensland, Queensland,
Australia, MH059588, Bryan Mowry, M.D. (PI); Mt. Sinai School of Medicine, New
York, NY, MH59586, Jeremy Silverman, Ph.D. (PI). The samples were collected by
Vishwajit Nimgaonkar's group at the University of Pittsburgh, as part of a multi-institutional collaborative research project with Jordan Smoller, M.D., D.Sc., and Pamela
Sklar, M.D., Ph.D., Massachusetts General Hospital (grant MH 63420). Data and
biomaterials used in Study 23 were collected by the University of Pittsburgh and funded
by an NIMH grant (Genetic Susceptibility in Schizophrenia, MH 56242) to Vishwajit
Nimgaonkar, M.D., Ph.D. Additional Principal Investigators on this grant include Smita
Deshpande, M.D., Dr. Ram Manohar Lohia Hospital, New Delhi, India; and Michael
Owen, M.D., Ph.D., University of Wales College of Medicine, Cardiff, UK. Most
importantly, we thank the families who have participated in and contributed to these
studies.
References
1. Sahota A, Brooks AI, Tischfield JA: Protocol 1: Preparing DNA from Cell
Pellets; in: Weiner MP, Gabriel S, Stephens JC (eds): Genetic variation: a
laboratory manual. Cold Spring Harbor, New York: Cold Spring Harbor
Laboratory Press, 2007, pp 107-109.
2. Sahota A, Brooks AI, Tischfield JA: Protocol 6: Preparing DNA from Blood:
Large-Scale Extraction; in: Weiner MP, Gabriel S, Stephens JC (eds): Genetic
variation: a laboratory manual. Cold Spring Harbor, New York: Cold Spring
Harbor Laboratory Press, 2007, pp 124-128.
3. Meng H, Joyce AR, Adkins DE et al: A statistical method for excluding
non-variable CpG sites in high-throughput DNA methylation profiling. BMC
Bioinformatics 2010; 11: 227.