Hallwirth et al.
Coherence analysis of vector integration patterns
For Molecular Therapy
SUPPLEMENTARY METHODS
Transduction of CD34+ cells. Independent batches of MGMT-encoding MFG-based γ-retroviral vectors were collected from the supernatant of PG13 producer cells. The vectors
were identical, with the exception of an MGMT-P140K mutation in the construct used for
transduction under London conditions. Paris transduction conditions entailed thawing 1×10⁷
cells on Day 1 and pre-stimulating them in X-VIVO 10 medium (Lonza, Australia) + 4%
FCS + cytokines (300 ng/ml Flt3-L, 100 ng/ml TPO [R&D Systems, MN, USA], 300 ng/ml
SCF [Amgen, CA, USA], 60 ng/ml IL-3 [Stem Cell Technologies, Canada]) at 0.5×10⁶
cells/ml in a total volume of 20 ml in one 85 cm² culture bag for 24 hours at 37°C, 5% CO₂.
Day 2: Cells (9.2×10⁶ total) were recovered from the culture bag, resuspended in 20 ml
vector supernatant with cytokines (as above + 2 µg/ml protamine) at 0.46×10⁶/ml in a
Retronectin (TaKaRa, Japan)-coated 85 cm² culture bag and incubated for 24 hours. Day 3:
Cells (10×10⁶ total) were recovered from the Retronectin-coated bag and resuspended in 20
ml fresh vector supernatant + cytokines + protamine (as above), reseeded into the same
Retronectin-coated bag at 0.5×10⁶/ml and incubated for 24 hours. Day 4: Cells (12.8×10⁶
total) were recovered from the Retronectin-coated bag; 1×10⁷ cells from this suspension were
resuspended in 20 ml fresh vector supernatant + cytokines + protamine (as above), reseeded
into the same Retronectin-coated bag at 0.5×10⁶/ml and incubated for 24 hours. Day 5: Cells
(15.5×10⁶ total) were recovered from the Retronectin-coated bag and washed in 4% human
serum albumin (Albumex 4, CSL, Australia) as would be done for infusion into a patient.
Transduction parameters were analyzed by flow cytometry (Table 1). London transduction
conditions differed from the Paris conditions in the following respects: Cells were cultured in
serum-free X-VIVO 10 medium supplemented with 1% human serum albumin and 20 ng/ml
IL-3 instead of 60 ng/ml. IL-3, TPO and Flt3-L were sourced from Cellgenix, Germany. Cells
were pre-stimulated for 40 hours, followed by two 24-hour transductions and one 6-hour
transduction.
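The seeding densities quoted for the Paris conditions follow from the number of cells reseeded each day and the 20 ml culture volume. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative only, and the cell numbers are those stated in the text.

```python
def density_per_ml(total_cells: float, volume_ml: float) -> float:
    """Cell density (cells/ml) for a given cell number and culture volume."""
    return total_cells / volume_ml

# Cells placed into the 20 ml culture on each day of the Paris protocol.
seeded = {"Day 1": 1.0e7, "Day 2": 9.2e6, "Day 3": 1.0e7, "Day 4": 1.0e7}
densities = {day: density_per_ml(n, 20.0) for day, n in seeded.items()}

for day, d in densities.items():
    print(f"{day}: {d / 1e6:.2f} x 10^6 cells/ml")
```

The Day 2 value works out to 0.46×10⁶ cells/ml and the other days to 0.5×10⁶ cells/ml, in agreement with the densities given above.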
Junction fragment library construction. Genomic DNA from transduced cells was extracted
using a Puregene Blood and Cell Culture DNA Kit (Qiagen, Australia), according to the
manufacturer’s protocol for cultured cells. DNA, eluted in DNA Hydration Solution, was
stored at -20°C until use. An LM-PCR method (ref. 16 in the main document) was employed to selectively amplify
junction fragments comprising LTR-derived proviral DNA and adjoining host DNA
sequences. The method was adapted to improve linker ligation efficiency and to
accommodate fragment library sequencing on the Illumina Genome Analyzer IIx (GAIIx)
platform. Transduced gDNA was digested with Tsp509I (New England Biolabs [NEB],
Genesearch, Australia; recognition sequence 5’-AATT-3’, leaving four-base 5’ overhangs).
Proteins were subsequently removed by organic extraction and the digested DNA was
precipitated in the presence of glycogen with sodium acetate and ethanol. The overhangs
were partially filled using Klenow Fragment (3’→ 5’ exo-) and dATP (NEB) in NEBuffer 2.
Adapters compatible with the partially filled overhangs as well as overhangs that had not
been successfully filled were made by annealing oligos (Sigma-Aldrich, Australia) 5’-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGC
and [5’-Phos]TTGCAGCCCG[AmC7] or [5’-Phos]AATTGCAGCCCG[AmC7], respectively, at
final concentrations of 40 µM each in 10 mM Tris pH 8.0, 0.1 mM EDTA. Annealing was
performed by incubation in a Mastercycler Gradient PCR machine (Eppendorf, Australia) for
2 min at 92°C, followed by a temperature decrease in increments of 0.1°C every 4 sec to
82°C, every 5 sec to 72°C, every 8 sec to 62°C, every 10 sec to 52°C, every 12 sec to 42°C
and every 15 sec to 12°C. Linkers were ligated to digested DNA fragments at a 10-fold molar
excess of linker over cut ends using T4 DNA Ligase (NEB) and supplementation with ATP
(NEB) to 1 mM. The number of cut ends was estimated from the median fragment length of
~275 bp and the concentration of Tsp509I-digested DNA. After linker ligation, a second RE
digestion was performed using BpmI (NEB) to cleave the vector-3’-LTR-derived fragments
arising from Tsp509I cleavage, thereby preventing subsequent amplification of an “internal
fragment”. Proteins were removed by organic extraction and DNA was precipitated as above
and reconstituted in water.
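The estimate of cut ends used to set the 10-fold linker excess can be illustrated with a short calculation. The Python sketch below is not the authors' worksheet: it assumes an average mass of 650 g/mol per base pair, two ends per fragment, a fully annealed linker stock at 40 µM, and a hypothetical input of 1 µg digested DNA. Only the ~275 bp median fragment length, the 10-fold excess and the 40 µM oligo concentration come from the text.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; commonly used average (assumption)

def cut_ends_pmol(dna_ng: float, median_len_bp: float = 275.0) -> float:
    """Estimated pmol of cut ends in a digest (two ends per fragment)."""
    fragments_pmol = dna_ng * 1000.0 / (median_len_bp * AVG_BP_MASS)
    return 2.0 * fragments_pmol

def linker_volume_ul(dna_ng: float, excess: float = 10.0,
                     linker_um: float = 40.0) -> float:
    """Volume of annealed linker (µl) giving the requested molar excess.

    1 µM equals 1 pmol/µl, so pmol divided by µM yields µl.
    """
    return excess * cut_ends_pmol(dna_ng) / linker_um

ends = cut_ends_pmol(1000.0)       # hypothetical 1 µg of Tsp509I-digested DNA
volume = linker_volume_ul(1000.0)
print(f"{ends:.2f} pmol ends, {volume:.2f} µl linker")
```

Under these assumptions, 1 µg of digested DNA corresponds to roughly 11 pmol of ends, so a 10-fold excess requires a little under 3 µl of 40 µM linker.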
LM-PCR amplifications were carried out with ~500 ng adapter-ligated gDNA fragments as
template in 50-µl reactions using HotStarTaq Plus DNA Polymerase (Qiagen), at a final
MgCl₂ concentration of 2 mM. The amplification utilized the linker-specific primer L1
(5’-GACTCACTATAGGGCACGCGT) and the MLV LTR-specific primer MLV1
(5’-CATGCCTTGCAAAATGGCGTTACTTAAGC) in a touch-down PCR format: 1× 95°C,
5 min; 7× (94°C, 30 sec; 72°C, 1 min); 37× (94°C, 30 sec; 68°C, 1 min); 1× 68°C, 3 min;
hold at 12°C. Amplicons of 120-400 bp were gel-excised and purified using a Wizard SV Gel
and PCR Clean-Up System (Promega, Australia). Amplicons >400 bp were gel-purified
separately and retained for reprocessing.
Nested PCR amplifications were carried out under the same master mix reagent
concentrations and thermal cycling conditions as the LM-PCR, using 5 µl each of 1 in 10
and 1 in 100 dilutions of the 120-400 bp LM-PCR products in 25-µl reaction volumes, and
utilizing a linker-specific primer LNT1 whose 3’ end is complementary to the Tsp509I
recognition site (5’-CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGTCGACGGCCCGGGCTGCAATT)
and an MLV LTR-specific primer MLVN1 that is recessed four base pairs from the beginning
of the proviral 5’ LTR sequence
(5’-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGCTTGCCAAACCTACAGGTGGGGTCT).
Primers LNT1 and MLVN1 were 5’-tailed
with the Illumina GAIIx single-read specific sequences (underlined) required for capture on
the oligonucleotide lawn on the GAIIx flow cells and subsequent sequencing-by-synthesis.
Nested PCR amplicons were size-selected in the same way as LM-PCR products, but with a
size range of 160-500 bp. The lower limit was chosen so that amplicons would contain at
least 18 bp of genomic DNA sequence adjacent to the ISs, and the upper limit to facilitate
optimal bridge amplification on the Illumina flow cells. LM-PCR and nested PCR steps were
repeated for each sample until the available starting material had been processed.
Gel-purified LM-PCR amplicons >400 bp were digested in parallel with two other REs
having four-base recognition sequences, namely MboI (NEB) and Csp6I (Roche, Australia).
Digested fragments were ligated to linkers having overhangs compatible with the respective
cut ends and used in LM-PCR amplifications using primers MLVN1 and either LNM1
(5’-CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGTCGACGGCCCGGGCTGCGATC) or LNC1
(5’-CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGTCGACGGCCCGGGCTGCTA), respectively.
Amplification products were size-selected in the same manner
as Tsp509I-generated LM-PCR products. Aliquots of all Tsp509I-, MboI- and Csp6I-generated
LM-PCR products were pooled in proportions such that their final relative
contributions to each of the junction fragment libraries were in accordance with their
estimated proportions within the original LM-PCR products. The junction fragment libraries
were sequenced on an Illumina GAIIx platform in a 1×76 bp read format (Genome Institute
of Singapore), using a custom sequencing primer recessed by two positions relative to the
standard Illumina single-read sequencing primer.
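The rationale for reprocessing long amplicons with MboI and Csp6I is that a junction lacking a nearby Tsp509I site may still lie close to a different four-base site. The Python sketch below illustrates this with an in silico digest. The recognition sequences for MboI (GATC) and Csp6I (GTAC) and the cut offsets are taken from standard enzyme data rather than from this document, the sequence is a toy example, and the function names are hypothetical.

```python
from typing import List, Optional

# Recognition sequence and top-strand cut offset (0 = cut before first base).
ENZYMES = {"Tsp509I": ("AATT", 0), "MboI": ("GATC", 0), "Csp6I": ("GTAC", 1)}

def cut_positions(seq: str, enzyme: str) -> List[int]:
    """Top-strand cut coordinates (0-based) for one enzyme."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)  # step by 1 to allow overlapping sites
    return cuts

def distance_to_first_cut(seq: str, junction: int, enzyme: str) -> Optional[int]:
    """Bases between the junction and the nearest downstream cut, if any."""
    for cut in cut_positions(seq, enzyme):
        if cut >= junction:
            return cut - junction
    return None

toy = "CCGATCAAGTACGGAATTCC"  # toy host sequence, not real data
for name in ENZYMES:
    print(name, cut_positions(toy, name), distance_to_first_cut(toy, 5, name))
```

In this toy sequence a junction at position 5 is 9 bases from the Tsp509I cut but only 4 bases from the Csp6I cut, and has no downstream MboI site, which is the kind of difference the parallel digests are meant to exploit.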
Sample code for coherence analysis.
% Read the integration site (IS) positions for each dataset
phc001.times = dlmread('PHC001_full_datset_IS_hg18');
phc004.times = dlmread('PHC004_full_datset_IS_hg18');
% Divide the positions by 10^9 so that each dataset fits into the frame
phc001.times = phc001.times/1e+09;
phc004.times = phc004.times/1e+09;
% Define parameters
delay_times = [0 3.0802];
params.Fs = 100;
params.err = [2 0.0500];
params.fpass = [0 50];
params.pad = 0;
params.tapers = [50 99];
delay_times = [0 5.01];   % this second assignment overrides the first
% Compute coherence
datasp1 = extractdatapt(phc001,delay_times,1);
datasp2 = extractdatapt(phc004,delay_times,1);
[C1,phi,S12,S1,S2,f,zerosp,confC,phistd,Cerr1] = ...
    coherencypt(datasp1,datasp2,params);
% Plot coherence (Cerr1-Cerr1 passes an all-zero error argument)
figure;
plot_vector(C1,f,'n',Cerr1-Cerr1,'b'); ylim([0 1]);
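For readers without access to the toolbox that supplies extractdatapt, coherencypt and plot_vector, the idea of coherence between two sets of integration sites can be illustrated in plain Python. The sketch below is not the multitaper point-process estimator used in the sample code: it bins positions into counts and averages naive Fourier spectra over non-overlapping segments, a simpler stand-in. All data are simulated, and the genome length, bin number and segment length are arbitrary assumptions.

```python
import cmath
import random

def bin_counts(positions, start, stop, n_bins):
    """Histogram of site positions in n_bins equal-width bins over [start, stop)."""
    width = (stop - start) / n_bins
    counts = [0.0] * n_bins
    for p in positions:
        if start <= p < stop:
            counts[min(int((p - start) / width), n_bins - 1)] += 1.0
    return counts

def dft(x):
    """Naive DFT of the mean-subtracted series at frequencies k = 1 .. n//2."""
    n = len(x)
    mean = sum(x) / n
    return [
        sum((x[t] - mean) * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(1, n // 2 + 1)
    ]

def coherence(x, y, seg_len):
    """Magnitude-squared coherence averaged over non-overlapping segments."""
    n_seg = len(x) // seg_len
    n_freq = seg_len // 2
    sxx = [0.0] * n_freq
    syy = [0.0] * n_freq
    sxy = [0j] * n_freq
    for s in range(n_seg):
        fx = dft(x[s * seg_len:(s + 1) * seg_len])
        fy = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_freq):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    return [
        abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) if sxx[k] > 0 and syy[k] > 0 else 0.0
        for k in range(n_freq)
    ]

# Simulated data: two site sets sharing hotspots, plus independent background.
random.seed(0)
GENOME = 1_000_000  # toy coordinate range (assumption)
hotspots = [random.uniform(0, GENOME) for _ in range(200)]
set_a = [h + random.gauss(0, 500) for h in hotspots] + \
        [random.uniform(0, GENOME) for _ in range(200)]
set_b = [h + random.gauss(0, 500) for h in hotspots] + \
        [random.uniform(0, GENOME) for _ in range(200)]

counts_a = bin_counts(set_a, 0, GENOME, 256)
counts_b = bin_counts(set_b, 0, GENOME, 256)
coh_ab = coherence(counts_a, counts_b, seg_len=32)
coh_aa = coherence(counts_a, counts_a, seg_len=32)
print([round(c, 2) for c in coh_ab])
```

Coherence values lie between 0 and 1 at each frequency, and a dataset compared with itself gives 1 throughout, which serves as a basic sanity check before two distinct datasets are compared.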
SUPPLEMENTARY TABLES
Table S1 Transduction performance under London and Paris SCID-X1 trial conditions
Vector preparation:     MFG-MGMT_1; MFG-MGMT_2; MFG-MGMT_3; MFG-γc(a)
Donor cells:            PBMC_1(b); PBMC_2(c); Patient BM(d)
CD34+ after isolation:  62.8%; 98.6%(e); 79%
Transduction:           London; London; London (“L”)(f); Paris (“P”)(f); Paris
Final CD34+:            90.29%; 95.59%; 96.41%; 49.30%; 37%
Transgene+:             16.16%; 16.36%; 16.64%; 64.85%; 28%
Transgene+ CD34+:       15.04%; 15.93%; 15.98%; 28.81%; 10%
Proliferation:          1.53×; 1.72×; 1.74×; 2.16×; 3.59×
(a) For treatment of SCID-X1 patient. See ref. 12 in main document.
(b) Harvested 2005 from pediatric oncology patient; frozen as bulk; selected on day 0.
(c) Harvested 1997 from pediatric oncology patient; CD34+ selected and cryopreserved.
(d) BM, bone marrow.
(e) CD34-positivity after thawing.
(f) Transduced cells referred to as L and P in main document.
Table S2 Distribution of integration sites (ISs) relative to genic categories
Dataset        Total ISs   TSS-proximal(a)   Intragenic(a)      Intergenic(a)
MRC            300 000     16 399 (5.47%)    125 864 (41.95%)   157 737 (52.58%)
P              250 213     66 989 (26.77%)   108 033 (43.18%)   75 191 (30.05%)
L              54 431      12 538 (23.03%)   23 356 (42.91%)    18 537 (34.06%)
SCID1_Paris    9 852       2 567 (26.06%)    3 804 (38.61%)     3 481 (35.33%)
SCID1_London   3 470       995 (28.67%)      1 367 (39.39%)     1 108 (31.93%)
(a) Defined in Materials and Methods of the main document.
Table S3 Fisher’s exact test (two-tailed) p-values of TSS-proximal integration
site count comparisons (from Table S2)
               P          L          SCID1_Paris   SCID1_London
MRC            < 0.0001   < 0.0001   < 0.0001      < 0.0001
P              -          < 0.0001   0.1173        0.0128
L              -          -          < 0.0001      < 0.0001
SCID1_Paris    -          -          -             0.0030
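The pairwise comparisons can be illustrated with a standard-library implementation of the two-tailed Fisher’s exact test. The Python sketch below assumes that each comparison is a 2×2 table of category versus non-category IS counts for two datasets, built from Table S2; that layout is an assumption, since the exact table construction is not stated here. The function names are illustrative.

```python
from math import exp, lgamma

def _log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_choose(row1 + row2, col1)

    def log_p(x: int) -> float:
        # Hypergeometric log-probability of x in the top-left cell.
        return _log_choose(row1, x) + _log_choose(row2, col1 - x) - log_denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_p(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(exp(lp) for lp in (log_p(x) for x in range(lo, hi + 1))
                if lp <= log_obs + 1e-7)
    return min(1.0, total)

# TSS-proximal vs. other ISs, SCID1_Paris against SCID1_London (Table S2).
p_paris_london = fisher_exact_two_tailed(2567, 9852 - 2567, 995, 3470 - 995)
print(f"p = {p_paris_london:.4f}")
```

If the assumed table layout matches the one used for Table S3, the printed value should fall close to the entry reported for SCID1_Paris versus SCID1_London.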
Table S4 Fisher’s exact test (two-tailed) p-values of intragenic integration site
count comparisons (from Table S2)
               P          L          SCID1_Paris   SCID1_London
MRC            < 0.0001   < 0.0001   < 0.0001      0.0025
P              -          0.2558     < 0.0001      < 0.0001
L              -          -          < 0.0001      < 0.0001
SCID1_Paris    -          -          -             0.4179
Table S5 Fisher’s exact test (two-tailed) p-values of intergenic integration site
count comparisons (from Table S2)
               P          L          SCID1_Paris   SCID1_London
MRC            < 0.0001   < 0.0001   < 0.0001      < 0.0001
P              -          < 0.0001   < 0.0001      0.0170
L              -          -          0.0145        0.0107
SCID1_Paris    -          -          -             0.0003
Table S6 Parameters for coherence analysis
Parameter                      Whole genome   Chromosome 19
Fs                             3000           1000
fpass                          [0 1500]       [0 500]
err                            [2 0.0500]     [2 0.0500]
pad                            0              0
tapers                         [50 99]        [50 99]
delay_times (divide factor)    10⁹            10⁷
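Two regularities in the Table S6 parameters can be checked directly: the upper limit of fpass equals half of Fs, and the tapers entry has the form [TW K] with K = 2·TW − 1. The Python sketch below encodes the table and verifies both. Reading tapers as [time-bandwidth product, number of tapers] is an assumption about the toolbox convention, and the example coordinate is hypothetical.

```python
PARAMS = {
    "whole_genome": {"Fs": 3000, "fpass": (0, 1500), "err": (2, 0.05),
                     "pad": 0, "tapers": (50, 99), "divide_factor": 1e9},
    "chromosome_19": {"Fs": 1000, "fpass": (0, 500), "err": (2, 0.05),
                      "pad": 0, "tapers": (50, 99), "divide_factor": 1e7},
}

def check(p: dict) -> dict:
    """Consistency checks on one parameter set."""
    tw, k = p["tapers"]
    return {
        "fpass_upper_is_half_Fs": p["fpass"][1] == p["Fs"] / 2,
        "K_equals_2TW_minus_1": k == 2 * tw - 1,
    }

def rescale(position_bp: float, divide_factor: float) -> float:
    """Map a genomic coordinate (bp) onto the rescaled analysis axis."""
    return position_bp / divide_factor

for name, p in PARAMS.items():
    print(name, check(p))

# Hypothetical coordinate near the end of a ~3.08 Gb concatenated genome.
scaled = rescale(3_080_200_000, PARAMS["whole_genome"]["divide_factor"])
print(scaled)
```

With the whole-genome divide factor of 10⁹, a coordinate of about 3.08 Gb maps to about 3.08 on the rescaled axis, the same order as the delay_times window used in the sample code.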
SUPPLEMENTARY FIGURE LEGENDS
Figure S1 Association between annotated genomic features and MLV vector integration
sites. Increased integration near the indicated feature, calculated by statistical comparison
against matched random controls using the ROC area method (references 6 and 32 in the
main manuscript), is shown in red, decreased integration in blue, with the intensity of shading
correlating with the degree of departure from random integration. Calculations of statistically
significant differences in abundance (relative to random integration), indicated by asterisks,
are calibrated against the SCID1_Paris dataset; * p < 0.05, ** p < 0.01, *** p < 0.001. Details
of relative integration abundance are available as “Supplementary report - Association of
Genomic Features with Integration”.
Figure S2 Overrepresentation of MLV vector integration sites at coding genes.
Overrepresentation values relative to random sites were calculated for proportions of
integration sites falling within ±100 kb of the TSS of each known coding gene. A subset of
oncogenes was extracted from this list of genes, leaving “other coding genes” (n = 17 822).
Oncogenes associated with hematological malignancies were designated “hematological
oncogenes” (n = 89), leaving “other oncogenes” (n = 1 852). Mean overrepresentation values
were calculated for each gene category within the two experimental transduction datasets and
SCID1_Paris. Mean overrepresentation values were compared for different gene categories
within the same datasets, and equivalent gene categories between datasets. All comparisons
of mean overrepresentation values showed statistical support for differences (independent t-tests, p < 0.05), except where indicated. Error bars indicate the standard errors of the means.
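The per-gene overrepresentation described for Figure S2 can be sketched as a ratio of proportions. The Python example below assumes that overrepresentation is the proportion of dataset ISs within ±100 kb of a TSS divided by the corresponding proportion of random sites; this definition is an assumption, as the exact formula is not given here. Coordinates are on a single toy chromosome, and all values and names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Optional, Tuple

WINDOW = 100_000  # ±100 kb around the TSS, as in the legend

def proportion_near(tss: int, sites: List[int]) -> float:
    """Fraction of sites lying within WINDOW of the TSS."""
    return sum(1 for s in sites if abs(s - tss) <= WINDOW) / len(sites)

def overrepresentation(tss: int, observed: List[int],
                       random_sites: List[int]) -> Optional[float]:
    """Observed/random proportion ratio; None if no random site is nearby."""
    p_rand = proportion_near(tss, random_sites)
    if p_rand == 0:
        return None
    return proportion_near(tss, observed) / p_rand

def mean_and_sem(values: List[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean for a gene category."""
    return mean(values), stdev(values) / sqrt(len(values))

# Toy data (not from the study).
observed = [50_000, 120_000, 180_000, 900_000]
random_sites = [100_000, 400_000, 700_000, 1_000_000]
ratio = overrepresentation(100_000, observed, random_sites)
category_mean, category_sem = mean_and_sem([1.0, 2.0, 3.0])
print(ratio, category_mean, round(category_sem, 4))
```

In practice the random set would be much larger than the toy list here, so that few genes have no random site within the window.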
SUPPLEMENTARY REPORT
Supplementary report - Association of Genomic Features with Integration. This is available
as a separate file. Within this report, datasets Fr1 and En2 correspond to datasets P and L in
the main manuscript, respectively. Dataset MLV is not relevant to the main manuscript.