Figure S1 - Sequencing analysis of two mixed barcodes of the second prototype (AGY) yielded
quantitative measurements. Two plasmid vectors containing known barcode sequences were
mixed at defined proportions and diluted in gDNA. (A) Sequencing analysis of the barcode
revealed the expected quantity of each plasmid. The values from analysis of the
electropherogram were plotted for clone pBS-AGY 1 (B) and pBS-AGY 2 (C), showing close
agreement between the experimental and expected results.
Figure S2 - Quantitative results obtained from the second prototype barcode (AGY) library
sequencing. A known barcode clone, pBS-AGY 1, was mixed at defined proportions with the
plasmid library. (A) Sequencing of the barcode revealed the expected quantity of the clone and
the library. The values for the clone (B) and library (C) were plotted, revealing close agreement
between the experimental and expected results.
Figure S3 - Electropherograms obtained from the mixture of the AGY library and a known clone
(Figure S2). The proportions of 1:0 (100% Library : 0% pBS-AGY 1), 3:1 (75% Library : 25%
pBS-AGY 1), 1:3 (25% Library : 75% pBS-AGY 1) and 0:1 (0% Library : 100% pBS-AGY 1)
were sequenced in triplicate from a single PCR product, showing the reproducibility of the
sequencing. These charts also show visually how the peak area changes with the quantity of
the clone present in the mixture.
Figure S4 - Immunofluorescence to detect LMO2 expression in transduced cells. NIH 3T3 cells
were transduced or not (control) with the lentiviral vector containing the barcode and carrying the
LMO2 transgene. After transduction, cells were stained with polyclonal anti-LMO2 antibody
(Abcam ab72841). We detected both the eGFP protein expressed by the vector and the
anti-LMO2 staining, shown in red, in the nuclei of transduced cells. Although there was red
staining in the untransduced cells, we interpret this as non-specific since the LMO2 protein is
exclusively nuclear.
Figure S5 - Flow cytometry analysis of IL2RG expression in HT1080 cells. (A) Untransduced
cells; (B) transduction with the lentiviral vector containing the barcode; (C) transduction with
the lentiviral vector containing the barcode plus the IL2RG transgene. Cells were
stained with PE-conjugated anti-CD132 antibody (clone #31134; R&D Systems). After labeling,
the cells were permeabilized, leading to loss of eGFP, so only the antibody staining is
shown.
Figure S6 - Transduction efficiency of HSC. The HSC used in the in vivo assay were analyzed
by flow cytometry for eGFP expression. As seen above, our transduction protocol yielded
approximately 40% transduced cells.
Figure S7 - Hematologic analyses of transplant groups during long-term observation. Peripheral
blood was analyzed at the indicated time points by manual counting of each of the indicated
cell types. (A) White blood cells, (B) lymphocytes and (C) neutrophils. Animals were
transplanted with hematopoietic stem cells that had not been transduced (Control) or
transduced with the vector encoding the library, but no gene of interest (Library) or encoding
LMO2 or IL2RG.
Supplementary Materials and Methods
Myelogram
Morphologic analyses of the bone marrow indicated the presence of two blast populations. The
first was of smaller size, with a high nucleus/cytoplasm ratio, no granules, loose nuclear
chromatin and 1 to 2 small nucleoli. The second was of larger size, with a lower
nucleus/cytoplasm ratio, intense cytoplasmic basophilia, denser nuclear chromatin, 1 to 3 large,
dysplastic nucleoli and no cytoplasmic granulations.
Note that in mice, CD117 is a marker of primitive myeloid and lymphoid precursors. This marker
disappears as B and T cells develop [1]. Therefore, since we observed a reduction in CD34
(Figure 6D) without a significant difference in CD117, and an increase in the percentage of
immature B cells (Figure 6B), we infer that the blasts are of the B lymphocyte lineage.
Detailed description of calculations
Supplementary analysis A:
Barcode sequence quantification: Step by step analysis of two mixed plasmid vectors
The prototype barcode AGY and the experiment shown in Figure S1 were used to
exemplify the barcode quantification from a mixture of two plasmid vectors. Below we describe
the steps for this analysis.
Step 1 - Identification of the region of interest from the barcode sequencing result. Only the
positions containing different nucleotides between the two clones were analyzed (red selection).
Barcode sequence
Clone      1st position  2nd position  3rd position  4th position  5th position
pBS-AGY1   AGC           AGC           AGC           AGC           AGT
pBS-AGY2   AGC           AGT           AGT           AGT           AGT
Step 2 - PolyPhred results: Since each variable position of the barcode may contain two
different nucleotides, the PolyPhred values of both nucleotides were determined (i.e., for the
prototype barcode AGY, the nucleotides C and T were analyzed at each position). The
PolyPhred values are shown in the following table.
Ratio of plasmid pBS-AGY1:pBS-AGY2
Position      Base  1:0       9:1       3:1       1:1       1:3       1:9       0:1
2nd position  C     63174.08  64144.50  65280.00  30558.19  3872.57   2205.63   0.00
2nd position  T     0.00      2248.21   8104.50   5542.26   15560.19  14151.00  24718.58
3rd position  C     63174.08  64144.50  65280.00  28248.56  6346.71   1195.57   0.00
3rd position  T     0.00      5086.12   14502.78  4552.57   15462.32  15880.17  25954.51
4th position  C     63174.08  64144.50  65280.00  27964.30  8713.29   5153.34   0.00
4th position  T     0.00      8919.13   22394.00  3958.76   13798.66  12315.95  31971.53
Step 3 - Data normalization: The PolyPhred values are then used to calculate the contribution
of each base at the variable site. For each position, the value of each nucleotide is normalized
(expressed as percent) so that the sum of both nucleotides is 100%. The values are shown in
the following table.
Ratio of plasmid      2nd position     3rd position     4th position
pBS-AGY1:pBS-AGY2     C       T        C       T        C       T
1:0                   100.00  0.00     100.00  0.00     100.00  0.00
9:1                   96.61   3.39     92.65   7.35     87.79   12.21
3:1                   88.96   11.04    81.82   18.18    74.46   25.54
1:1                   70.18   29.82    72.19   27.81    74.31   25.69
1:3                   19.93   80.07    29.10   70.90    38.71   61.29
1:9                   13.48   86.52    7.00    93.00    29.50   70.50
0:1                   0.00    100.00   0.00    100.00   0.00    100.00
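The normalization in Step 3 can be sketched in a few lines of Python. This is a minimal illustration and not the software used for the analysis; the function name normalize_pair is hypothetical, and the input values are the C and T PolyPhred values for the 2nd position of the 9:1 mixture from Step 2.

```python
def normalize_pair(c_value, t_value):
    """Express two raw peak values as percentages that sum to 100."""
    total = c_value + t_value
    if total == 0:
        raise ValueError("both peak values are zero; cannot normalize")
    return 100.0 * c_value / total, 100.0 * t_value / total


# 2nd position of the 9:1 mixture (raw PolyPhred values from Step 2)
c_pct, t_pct = normalize_pair(64144.50, 2248.21)
print(f"C = {c_pct:.2f}%, T = {t_pct:.2f}%")
```

For this input the sketch gives C = 96.61% and T = 3.39%, matching the 9:1 entry for the 2nd position in the normalized table.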
Step 4 - Quantification results: The mean value of the analyzed nucleotides at the selected
positions was calculated.
Ratio of plasmid pBS-AGY1:pBS-AGY2  1:0    9:1   3:1   1:1   1:3   1:9   0:1
pBS-AGY 1 (C mean)                  100.0  92.4  81.7  72.2  29.2  16.7  0.0
pBS-AGY 2 (T mean)                  0.0    7.7   18.3  27.8  70.8  83.3  100.0
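As a hedged illustration of Step 4, the sketch below averages the normalized C percentages over the three analyzed positions to estimate pBS-AGY1, and takes the remainder as pBS-AGY2. The names C_PERCENT and quantify_two_clones are hypothetical. Because C and T sum to 100 at every position, the remainder equals the T mean.

```python
from statistics import mean

# Normalized C percentages at the 2nd, 3rd and 4th positions (from Step 3)
C_PERCENT = {
    "9:1": [96.61, 92.65, 87.79],
    "1:3": [19.93, 29.10, 38.71],
}


def quantify_two_clones(c_values):
    """Return (pBS-AGY1 %, pBS-AGY2 %) from per-position C percentages."""
    clone1 = mean(c_values)   # pBS-AGY1 carries C at every analyzed position
    clone2 = 100.0 - clone1   # same as the T mean, since C + T = 100 per position
    return clone1, clone2


for ratio, values in C_PERCENT.items():
    clone1, clone2 = quantify_two_clones(values)
    print(f"{ratio}: pBS-AGY1 = {clone1:.2f}%, pBS-AGY2 = {clone2:.2f}%")
```

The results should agree with the Step 4 values to within rounding in the last reported digit.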
Supplementary analysis B:
Barcode sequence quantification: Step by step analysis of barcode library mixed with a known
barcode sequence
The prototype barcode AGY and the experiment shown in Figure S2 were used to exemplify
the barcode quantification from a mixture of barcode library and a known barcode sequence.
Below we describe the steps for this analysis. Because normalization of PolyPhred data was
exemplified in Supplementary Analysis A, those steps are omitted from this explanation.
Step 1 - The library barcode was sequenced in triplicate to determine the expected mean value
of each possible base at each variable position. This gives a baseline of the library barcode and
is used to calculate the expected values for the contribution of the library to the mixture. Note
that repeated sequencing of the same sample yielded consistent data.
Normalized library values
        1st position   2nd position   3rd position   4th position   5th position
        C      T       C      T       C      T       C      T       C      T
SEQ 1   55.07  44.93   54.99  45.01   53.31  46.69   38.77  61.23   37.95  62.05
SEQ 2   53.90  46.10   54.81  45.19   53.86  46.14   37.61  62.39   38.73  61.27
SEQ 3   55.08  44.92   53.95  46.05   50.50  49.50   39.66  60.34   40.06  59.94
Mean    54.68  45.32   54.58  45.42   52.56  47.44   38.68  61.32   38.91  61.09
Step 2 - Calculate the expected values for the library in the mixture: Based on the mean
normalized value of the library (table above), the expected contribution of the library in each
mixture was calculated.
Normalized library values
Library      1st position   2nd position   3rd position   4th position   5th position
percentage   C      T       C      T       C      T       C      T       C      T
0%           0.00   0.00    0.00   0.00    0.00   0.00    0.00   0.00    0.00   0.00
10%          5.47   4.53    5.46   4.54    5.26   4.74    3.87   6.13    3.89   6.11
25%          13.67  11.33   13.65  11.35   13.14  11.86   9.67   15.33   9.73   15.27
50%          27.34  22.66   27.29  22.71   26.28  23.72   19.34  30.66   19.46  30.54
75%          41.01  33.99   40.94  34.06   39.42  35.58   29.01  45.99   29.19  45.81
90%          49.22  40.78   49.12  40.88   47.30  42.70   34.81  55.19   35.02  54.98
100%         54.68  45.32   54.58  45.42   52.56  47.44   38.68  61.32   38.91  61.09
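The expected library contribution in Step 2 is the library baseline scaled by the fraction of library in the mixture. A minimal sketch follows, using the three replicate values for C at the 1st position from Step 1. The function name expected_library_value is hypothetical, and the fraction is written as a number between 0 and 1.

```python
from statistics import mean

# C % at the 1st position in the three library sequencing runs (Step 1)
first_position_c = [55.07, 53.90, 55.08]


def expected_library_value(replicates, library_fraction):
    """Scale the mean library baseline by the library fraction (0 to 1)."""
    if not 0.0 <= library_fraction <= 1.0:
        raise ValueError("library_fraction must be between 0 and 1")
    return mean(replicates) * library_fraction


for fraction in (0.10, 0.25, 0.50):
    value = expected_library_value(first_position_c, fraction)
    print(f"{fraction:.0%} library: expected C at 1st position = {value:.2f}")
```

For 10%, 25% and 50% library this gives approximately 5.47, 13.67 and 27.34, in line with the first column of the table.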
Step 3 – Sequencing data normalization: PolyPhred values of the experimental samples were
obtained and normalized. The results are shown in the following table.
Normalized values from experimental samples
Ratio of           1st position   2nd position    3rd position    4th position   5th position
pBS-AGY1:Library   C      T       C       T       C       T       C      T       C      T
1:0                98.36  1.64    100.00  0.00    100.00  0.00    99.71  0.29    0.00   100.00
9:1                94.15  5.85    94.59   5.41    95.09   4.91    94.87  5.13    0.00   100.00
3:1                88.90  11.10   89.10   10.90   89.53   10.47   87.22  12.78   0.00   100.00
1:1                80.10  19.90   80.54   19.46   80.14   19.86   74.46  25.54   8.19   91.81
1:3                70.89  29.11   70.32   29.68   69.22   30.78   60.39  39.61   25.15  74.85
1:9                61.74  38.26   58.01   41.99   59.45   40.55   51.51  48.49   34.89  65.11
0:1                55.07  44.93   54.99   45.01   53.31   46.69   38.77  61.23   37.95  62.05
Step 4 – Calculate the mean value of pertinent bases in the barcode: Since the pBS-AGY1
clone contributes a single base at each variable site, the mean value for these bases was
calculated in each mixture. This value represents the total contribution of both the clone and the
library. The peak area specific for the clone can be determined if we assume that the
contribution from the library is as determined in step 2. In preparation for this, the mean value
for the relevant base as determined by sequencing is calculated (‘sample mean values’ below,
example highlighted in red below and in step 3) and the expected mean value of the library for
the relevant base was calculated from step 2 (‘expected mean values’ below, example
highlighted in green below and in step 2).
Ratio  Sample mean values  Expected mean values of library
1:0    99.61               0.00
9:1    95.74               5.23
3:1    90.95               13.08
1:1    81.41               26.16
1:3    69.13               39.24
1:9    59.16               47.09
0:1    52.84               52.32
Step 5 – Quantification of pertinent bases in the barcode: To obtain the final result for the
quantification of the pBS-AGY1 clone in the mixture, the ‘expected mean values of library’ are
subtracted from the ‘sample mean values’ (above) and the result is shown in the table below
(‘pBS-AGY1’). The contribution of the library is calculated by subtracting the ‘pBS-AGY1’ value
from 100.
Ratio of pBS-AGY1:Library  pBS-AGY 1  Library
1:0                        99.61      0.39
9:1                        90.51      9.49
3:1                        77.87      22.13
1:1                        55.25      44.75
1:3                        29.90      70.10
1:9                        12.08      87.92
0:1                        0.52       99.48
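Steps 4 and 5 can be illustrated together for the 9:1 mixture. In this sketch (the names are hypothetical), the sample values are the normalized percentages of the base carried by pBS-AGY1 at each of the five positions (C at positions 1 to 4, T at position 5, following the clone sequence AGC AGC AGC AGC AGT), and the expected library values are the entries for the same bases at 10% library from Step 2.

```python
from statistics import mean

# 9:1 mixture (pBS-AGY1:library): normalized value of the pBS-AGY1 base
# at positions 1-5 (C, C, C, C, T), taken from Step 3
sample_values = [94.15, 94.59, 95.09, 94.87, 100.00]

# Expected library contribution for the same bases at 10% library (Step 2)
expected_library_values = [5.47, 5.46, 5.26, 3.87, 6.11]


def quantify_clone(sample, expected_library):
    """Return (clone %, library %) after subtracting the library baseline."""
    sample_mean = mean(sample)              # Step 4: sample mean value
    library_mean = mean(expected_library)   # Step 4: expected mean of library
    clone = sample_mean - library_mean      # Step 5: clone-specific signal
    return clone, 100.0 - clone


clone_pct, library_pct = quantify_clone(sample_values, expected_library_values)
print(f"pBS-AGY1 = {clone_pct:.2f}%, library = {library_pct:.2f}%")
```

This gives about 90.5% for pBS-AGY1 and 9.5% for the library, consistent with the 9:1 row of the Step 5 table.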
Supplementary analysis C:
Temporal variance analysis
The calculation of temporal variance is made by comparing successive time points. For this
analysis, the PolyPhred normalized data were obtained as exemplified in Supplementary
Analysis A, and only one nucleotide at each barcode position was analyzed (the value of the
other possible nucleotide is, by definition, complementary). There is no need to correct for the
contribution of the specific clone versus the library since we are only interested in the change
observed over time. The absolute value of the difference between normalized values of
successive time points is determined for the chosen nucleotide for each variable position in the
barcode. The final result for the temporal variance is the average of these values.
For example, the table below shows T1 and T2 from the 3:1 mixture (known barcode:library) of
the tissue culture based assay (Figure 4 in the main text). In this example the temporal variance
between T1 and T2 is 2.28.
Normalized values
Position  1st    2nd    3rd    4th    5th    6th    7th    8th    9th    10th
T1        31.46  44.40  38.83  46.23  31.26  16.24  67.89  72.15  65.53  73.60
T2        30.04  50.11  36.10  50.99  34.79  15.74  67.94  72.96  63.13  74.46
T1-T2     1.42   5.71   2.73   4.76   3.53   0.5    0.05   0.81   2.4    0.86
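The temporal variance calculation described above (the mean absolute difference between normalized values at successive time points) can be sketched as follows. This is an illustrative reimplementation under that description, not the authors' software, and the function name temporal_variance is hypothetical. The inputs are the T1 and T2 values of the example.

```python
def temporal_variance(t1, t2):
    """Mean absolute difference between two time points, position by position."""
    if len(t1) != len(t2) or not t1:
        raise ValueError("t1 and t2 must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(t1, t2)) / len(t1)


# Normalized values of the chosen nucleotide at barcode positions 1-10
t1 = [31.46, 44.40, 38.83, 46.23, 31.26, 16.24, 67.89, 72.15, 65.53, 73.60]
t2 = [30.04, 50.11, 36.10, 50.99, 34.79, 15.74, 67.94, 72.96, 63.13, 74.46]

print(f"temporal variance = {temporal_variance(t1, t2):.2f}")
```

With these inputs the sketch reproduces the value of 2.28 quoted for the example.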
Software for temporal variance analysis
Software was developed to facilitate analyses. The software searches for the barcode sequence
in the PolyPhred files, normalizes the nucleotide values of the variable positions and calculates
the temporal variance. The software can be obtained by contacting DBZ
(daniela.zanatta@icesp.org.br) or BES (bstrauss@usp.br).
Reference
1. Bhandoola, A and Sambandam, A. (2006). From stem cell to T cell: one route or many?
Nat Rev Immunol 6: 117-126.