Sequence of the A4varICAM locus and adjacent genes

Supplemental Information
Classification of var genes for study
Var genes are large, two-exon structures, with exon 1 coding for the surface-exposed portion of PfEMP1. This region is generally highly variable in length (2.7-10.4 kb) and sequence, both within and between isolates. Based on the 3D7 genome sequence, exon 1 of most var genes is composed of Duffy Binding-like (DBL) domains of multiple sequence groups (α, β, γ, δ, ε or X) and cysteine-rich interdomain regions (CIDR; see Smith 2000 for nomenclature). The 57 complete var genes of 3D7 that encode an N-terminal domain of type DBL1 followed by a CIDR domain are very polymorphic within the 3D7 genome [(Gardner et al., 2002) and latest release at www.sanger.ac.uk], and they are also not shared as intact genes in other parasite genomes [(Taylor et al., 2000) and Kraemer et al., submitted].
In 3D7, 23 of the 57 polymorphic var genes are in chromosome-central clusters and tend to have 5' promoter sequences of type upsB or upsC. Most of the remaining polymorphic var genes are located adjacent to and transcribed away from the telomere, and have upsB type 5' sequences. A small number of var genes are adjacent to these upsB-type vars, transcribed towards the telomere, and have upsA1 (formerly upsA) type 5' promoters. The highly polymorphic var genes in 3D7 are thus the most abundant var type, and they have upsA1, upsB or upsC 5' promoters.
In contrast, a total of 5 var genes in the 3D7 genome are highly conserved between isolates, having upsA1 (formerly upsA; type 3 var), upsA2 (formerly upsD; var1csa; pseudogene PFE1640w) and upsE (var2csa) type promoters (Rowe and Kyes, 2004; Trimnell et al., 2006). Although var2csa and type 3 var appear to be regulated by mutually exclusive gene expression, the upsA2 type var1csa gene is expressed independently of phenotype in parasites where the gene is intact (Kyes et al., 2003). We considered these semi-conserved, subtelomeric genes to be unusual representatives of the var gene family, on the basis of having unusually short introns and/or unusual 5' promoter regions, both sites being implicated in gene regulation (Calderwood et al., 2003; Deitsch et al., 1999; Kraemer and Smith, 2003; Lavstsen et al., 2003). However, they are investigated here for completeness.
Map and sequence of the A4varICAM/R29R+var1 locus
The 3D7 isolate, on which the published genome is based, is particularly unsuitable for deriving phenotypically homogeneous parasites, due to low levels of PfEMP1 expression at the RBC surface. Instead, we use the IT isolate, for which cytoadherent phenotypes have been well characterized. We wanted to examine RNA polymerase activity both within and near the A4varICAM gene, because IT parasite populations can be selected to high homogeneity for expression of the protein this gene encodes. This most closely approximates a population that is 'clonal' for var expression. The genome sequence of IT is not yet available, so we first mapped, cloned and sequenced the region surrounding A4varICAM (Figure S1).
Using a combination of PCR and vectorette cloning, with confirmation by Southern blot and restriction mapping, we constructed a contig sequence for one end of chromosome 13 of the IT genome, corresponding to the 'left' end of 3D7 chromosome 13 (mapping by hybridisation to the same end as the gene for glycophorin binding protein homologue 2; gbph2; accession no. X69769; PF13_0010). We showed previously by mapping that the A4varICAM and upsA type R29R+var1 genes are located in tail-to-tail orientation at this end of chromosome 13 in most IT parasites, except those that express the R29R+var1 gene (Horrocks et al., 2004).
Non-coding sequences near var genes show much more similarity than coding regions (Taylor et al., 2000; Kraemer et al., submitted). Conservation in non-coding sequences may indicate conservation of function, or may simply reflect that the coding segments, exposed at the host-parasite interface, are under constant selection. Whatever the underlying cause, the conserved non-coding genomic contexts of these variant antigen families in 3D7 allow prediction of gene organization in new isolates. From the 3D7 genome information we were able to predict rif gene positions relative to the var genes. In 3D7, most subtelomeric upsB/upsA var gene pairs have a rif sequence located between them, and most upsA vars are arranged head-to-head with a rif gene (Gardner et al., 2002). The distance between A4varICAM and R29R+var1 is only ~1kbp, with no open reading frames in the intergenic sequence. However, we amplified nine rif genes from PFG-purified chromosome 13 DNA, and by chromosome pulsed field gel hybridisation and restriction mapping we determined that two of these rif genes were within ~25kbp of the R29R+var1 gene. We mapped one of these rif genes, ITrif13.1, as closest to the R29R+var1 gene, and we were able to extend the contig sequence out to this gene on the prediction, from similar var pairs in 3D7, that it would be in head-to-head orientation with R29R+var1. The second rif gene, ITrif13.2, lies telomere-distal of ITrif13.1, within ~12kbp. Although we were able to amplify several stevor sequences from A4 chromosome 13 DNA, none of these mapped to the same ApaI fragment as A4varICAM, and therefore these are probably located at the other telomere.
In the R29 clone, the A4varICAM gene has been deleted, and the telomere is adjacent to the 3' end of R29R+var1, leaving only ~800bp between the stop codon of exon 2 and the telomere repeat sequence. In R29 genomic DNA, the chromosome 13 'left' telomere repeats end at a standard CA breakpoint, determined using telomere PCR. Unfortunately, this technique is not sufficiently sensitive to detect deletions in heterogeneous populations, even if that population is expressing R29R+var1. From clone R29 genomic DNA sequence, we can define the length of 3' downstream sequence sufficient for proper regulation of var gene expression as being 790bp. 3'RACE suggests that the transcript ends approximately 410-460nt after the stop codon (not shown).
Only two subtelomeric upsB/upsA1 var pairs in 3D7 are similar to A4varICAM/R29R+var1 in organization, having no rif gene between them: PF11_0007/PF11_0008 (intergenic distance 1067bp) and PF08_0141/PF08_0142 (1046bp). Although the intergenic distances of these two subtelomeric, tail-to-tail 3D7 var pairs are similar to that in A4/R29 (1030bp), the sequences are not similar. Better matches for the A4var-R29var intergenic sequence are found in chromosome-central var intergenic sequence. The A4varICAM upsB sequence is a common upsB type, with the full 1500bp highly similar to other upsB sequences for subtelomeric vars. The R29 5' untranslated upsA1 type sequence is most similar to that for the conserved 'type 3' vars.
Supplemental Information Methods
Genomic mapping of expressed var genes, and telomere PCR
Chromosome pulsed field gel (PFG) Southern blots, PFG separation blots of DNA digested in agarose blocks with rare-cutting restriction enzymes, or linear electrophoresed blots of liquid DNA digested with frequent-cutting enzymes were prepared as previously described (Smith et al., 1995). The distance to the end of the chromosome from the R29R+var1 gene in R29 parasites was estimated by restriction digest mapping with the rare cutter BglI and with BglII and EcoRI. PCR with a primer 200bp 3' of the end of R29 exon 2 (R29-3UTF: 5'-ATTTTGTATTTATTTGACAC) to a telomere repeat primer (5'-TGAACCCTGAACCCTGAACCC) was used to confirm this distance. PCR using previously reported telomere repeat primers either failed or did not work as efficiently as this primer. Conditions for PCR: 1.5mM MgCl2, 200µM dNTPs, 1u Perkin Elmer Taq polymerase per 50µl reaction, each primer at 1µM, and 100ng genomic DNA; 95°C 3min, followed by 30 cycles of 94°C 30sec, 45°C 30sec, 65°C 4min, followed by 65°C 10min. A4 and R29 genomic DNA templates were compared, and a fragment unique to R29 was gel-purified then cloned into pCR2.1 TA vector (Invitrogen); sequences were determined by BigDye sequencing/ABI (Applied Biosystems) analysis.
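The relationship between the telomere repeat primer given above and the repeat it anneals to can be checked in a few lines of Python. This is an illustrative sketch only, not part of the original methods: the helper names `reverse_complement` and `repeat_unit` are hypothetical, and the primer sequence is copied from the text. It shows that the primer is three tandem copies of TGAACCC, the reverse complement of a GGGTTCA repeat, which is consistent with the GGGTT(T/C)A heptamer commonly described for Plasmodium telomeres.

```python
# Hypothetical helpers for inspecting the telomere repeat primer from the text.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

TELOMERE_PRIMER = "TGAACCCTGAACCCTGAACCC"  # 5'-3', copied from the methods


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def repeat_unit(seq):
    """Return the shortest unit that tiles seq exactly (seq itself if none)."""
    for k in range(1, len(seq) + 1):
        if len(seq) % k == 0 and seq[:k] * (len(seq) // k) == seq:
            return seq[:k]


unit = repeat_unit(TELOMERE_PRIMER)
print(unit, len(TELOMERE_PRIMER) // len(unit))  # TGAACCC 3
print(reverse_complement(unit))                 # GGGTTCA
```

Because the primer consists only of repeat units, it can anneal anywhere within the repeat tract, so the shortest product from such a PCR is the one that reflects the distance from the gene-specific primer to the start of the repeats.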
Chromosome 13 subtelomeric rif gene identification
Localization of specific rif genes neighboring the A4varICAM and R29R+var1 genes was performed by separating chromosomes on PFG, staining representative lanes in ethidium bromide, then excising regions of the gel (not stained) corresponding to chromosome 13. A small segment of gel was then equilibrated in 10mM Tris, 1mM EDTA for 10 minutes at room temperature, the equilibration buffer was removed, and the tube was placed in a boiling water bath for several minutes, until the agarose was molten. Approximately 9 volumes of sterile water were added, mixed, and the tube placed in a boiling water bath for 30 sec. This was used as a PCR template. PCR for rif genes was performed with generic, degenerate primers designed to a single class of rif genes:
rifF4: ATTCCA/CACATGTA/GTA/TTG
rifR1: CTTCAA/TTTTA/GTTA/TTTTC/TG/TG/A/TCGATAACG
Primers were designed to a second class of rif genes, and although they amplified a product from genomic DNA, they never yielded any product on RT-PCR, so this class was not included in this study. Reactions were performed using standard conditions, 3mM MgCl2, with 95°C 3min, followed by 30 cycles of 94°C 30sec, 42°C 30sec, 65°C 60sec. PCR fragments were cloned and sequenced as above. Unique sequences were then hybridised to PFG chromosome separation blots, and to ApaI and ApaI/BglI digest blots, for confirmation of position.
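The slash notation in rifF4 and rifR1 marks positions where alternative bases are present in the primer pool (A/C means A or C at that position). As a hedged illustration, the Python sketch below parses that notation, rewrites it with standard IUPAC ambiguity codes, and counts the distinct sequences in each pool. The function names are hypothetical, and reading a chained group such as G/A/T as a three-way choice at a single position is an assumption about the notation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of allowed bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G",
    frozenset("T"): "T", frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("ACG"): "V", frozenset("ACT"): "H",
    frozenset("AGT"): "D", frozenset("CGT"): "B", frozenset("ACGT"): "N",
}


def parse_slash_primer(primer):
    """Split e.g. 'CA/CG' into per-position base choices ['C', 'AC', 'G']."""
    positions = []
    i = 0
    while i < len(primer):
        bases = primer[i]
        i += 1
        # Each '/' joins the next single base to the current position.
        while i + 1 < len(primer) and primer[i] == "/":
            bases += primer[i + 1]
            i += 2
        positions.append(bases)
    return positions


def to_iupac(primer):
    """Rewrite a slash-notation primer using IUPAC ambiguity codes."""
    return "".join(IUPAC[frozenset(p)] for p in parse_slash_primer(primer))


def expand(primer):
    """List every concrete sequence in the degenerate primer pool."""
    return ["".join(combo) for combo in product(*parse_slash_primer(primer))]


RIF_F4 = "ATTCCA/CACATGTA/GTA/TTG"
RIF_R1 = "CTTCAA/TTTTA/GTTA/TTTTC/TG/TG/A/TCGATAACG"

for name, primer in (("rifF4", RIF_F4), ("rifR1", RIF_R1)):
    print(name, to_iupac(primer), len(expand(primer)))
```

Under these assumptions rifF4 is an 8-fold degenerate 17-mer and rifR1 a 96-fold degenerate 27-mer.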
RTPCR, probe labeling and Southern blots for var expression analysis
We confirmed expression of A4varICAM in parasites with high monoclonal antibody Bc6 positivity by DBL1alpha-tag RTPCR of ring stage RNA (Bull et al., 2005). RTPCR products were 32P-labeled (Megaprime, GE Healthcare/Amersham Biosciences, as per manufacturer's instructions) and hybridised to a Southern blot containing a panel of A4 genomic DNA-derived DBL1alpha tag PCR fragments. Blots were exposed to regular speed film (Hyperfilm MP; GE Healthcare/Amersham Biosciences) or fast film (Biomax MS; Kodak), for various lengths of time. GenBank accession numbers AJ319680-AJ319712 correspond to tag numbers A4AFBR1-43 (consecutive as listed in Figure S2). A4AFBR tags missing from the list, e.g. A4AFBR3, were not submitted to GenBank due to sequence chimeras and discrepancies, and are therefore not shown here.
Hybridisations. PCR fragments of var and rif genes, and single-copy markers contained in plasmids or as PCR fragments, were labeled with alpha-32P-dATP (Megaprime, Amersham). Hybridisations were generally at 60°C (but 65°C for gene-specific exon 1 probes and alphaAF'-BR RTPCR probes) in 7% SDS, 0.5M Na-phosphate buffer pH 7.2, 2% dextran sulfate, 1mM EDTA; washes were at 68°C in 0.1xSSC (0.1xSSC = 0.015M NaCl, 1.5mM Na citrate, pH 7), 0.1% SDS.
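The definition of 0.1xSSC given for the wash buffer fixes the composition of SSC at any other strength by linear scaling. The short Python sketch below works through that arithmetic. It is an illustration only: the function names are hypothetical, and the 20x SSC and 10% SDS stock solutions are assumptions, since the text does not state how the wash buffer was prepared.

```python
# Composition of 0.1x SSC as stated in the text (mol/l).
SSC_0_1X = {"NaCl": 0.015, "Na citrate": 0.0015}


def ssc_composition(strength):
    """Scale the stated 0.1x SSC recipe linearly to another strength."""
    factor = strength / 0.1
    return {solute: conc * factor for solute, conc in SSC_0_1X.items()}


def stock_volume_ml(final, stock, final_volume_ml):
    """Stock volume needed by C1*V1 = C2*V2 (final and stock in same units)."""
    return final_volume_ml * final / stock


# 1x SSC implied by the stated 0.1x recipe.
print({k: round(v, 4) for k, v in ssc_composition(1.0).items()})

# Assumed stocks (not from the text): 20x SSC and 10% SDS, for 1 litre of wash.
print(round(stock_volume_ml(0.1, 20, 1000), 2), "ml of 20x SSC")
print(round(stock_volume_ml(0.1, 10, 1000), 2), "ml of 10% SDS")
```

On this scaling, 1x SSC corresponds to 0.15M NaCl and 15mM Na citrate.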
Supplemental Information Figure legends
Figure S1. Restriction map of A4varICAM/R29R+var1 locus, and relationship to published and new sequences.
We used restriction mapping to orientate these genes within the parasite genome, and found that they were both located on chromosome 13, with slight differences between A. A4 cloned parasites and B. R29 clone parasites.
The 5' end of the R29R+var1 gene lies on a 110kb ApaI/BglI fragment in both A4 and R29 clone genomic DNA. This same ApaI/BglI fragment hybridizes to GBPH2, glycophorin binding protein homologue 2. The 3' end of the R29R+var1 gene is on a 40kb BglI fragment in A4 clone parasites, and a 6kb fragment in R29 clone parasites. Correspondingly, the entire A4varICAM gene is on a 40kb BglI fragment in A4 parasites. Although DBL1, DBL2 and part of DBL3 of A4varICAM are present in R29, this fragment of A4varICAM has rearranged to a different chromosome (not shown).
In A4 and related parasites, orientation of the A4varICAM gene has been confirmed as tail-to-tail with R29R+var1 by restriction digestion, then by PCR to span the distance between the genes. In R29 parasites, the 790bp from the 3' end of the R29R+var1 gene to the telomere repeats has been cloned and sequenced (the breakpoint occurs between positions 12693 and 12694 in the contig sequence, accession no. AM411451). ITrif13.1 lies on a 15kb EcoRI fragment adjacent to the R29R+var1 gene; PCR and sequencing confirmed that this rif is arranged in head-to-head orientation with the var gene. A second rif gene, ITrif13.2, mapped to the same end of chromosome 13, centromeric to ITrif13.1, on the same ApaI fragment. R = EcoRI, B = BglI, ApaI. Published sequences indicated by horizontal bars. Accession numbers: A4varICAM: L42244; R29R+var1: Y13402; R29R+var1 exon2: Y13402, AJ535777; R29 5': AJ582223; Contig of entire locus: AM411451; partial ITrif13.2: AM411450.
Figure S2. At 10 hours post invasion, RNA polymerase activity is detected for only the expected dominant var gene, not multiple var genes.
Run-on probe was prepared from a relatively homogeneous IT parasite population (selected three times on monoclonal antibody Bc6) at ring stage, approximately 10 hours post-invasion. Only A4varICAM appears to be transcribed across all DBL domains. PCR fragments indicated are: var gene DBLs (numbers) and tag (t); A4varICAM (A4varICAM); CS2 (CS2var; DBL1 sequence is identical to that of A4varICAM); Tres (A4TresICAMvar); 17 (tag for ITg-ICAMvar); R29 (R29R+var1); D1 (Dd2var1); var1 (var1csa); stage-specific genes T (MSP1); R (KAHRP).
References
Bull, P.C., Berriman, M., Kyes, S., Quail, M.A., Hall, N., Kortok, M.M., et al. (2005)
Plasmodium falciparum variant surface antigen expression patterns during malaria. PLoS
Pathog 1: e26.
Calderwood, M.S., Gannoun-Zaki, L., Wellems, T.E., and Deitsch, K.W. (2003)
Plasmodium falciparum var genes are regulated by two regions with separate promoters,
one upstream of the coding region and a second within the intron. J Biol Chem 278:
34125-34132.
Deitsch, K.W., del Pinal, A., and Wellems, T.E. (1999) Intra-cluster recombination and
var transcription switches in the antigenic variation of Plasmodium falciparum. Mol
Biochem Parasitol 101: 107-116.
Gardner, M.J., Hall, N., Fung, E., White, O., Berriman, M., Hyman, R.W., et al. (2002)
Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419:
498-511.
Kraemer, S.M., and Smith, J.D. (2003) Evidence for the importance of genetic structuring
to the structural and functional specialization of the Plasmodium falciparum var gene
family. Mol Microbiol 50: 1527-1538.
Lavstsen, T., Salanti, A., Jensen, A.T., Arnot, D.E., and Theander, T.G. (2003) Subgrouping of Plasmodium falciparum 3D7 var genes based on sequence analysis of coding
and non-coding regions. Malar J 2: 27.
Rowe, J.A., and Kyes, S.A. (2004) The role of Plasmodium falciparum var genes in
malaria in pregnancy. Mol Microbiol 53: 1011-1019.
Smith, J.D., Chitnis, C.E., Craig, A.G., Roberts, D.J., Hudson-Taylor, D.E., Peterson,
D.S., et al. (1995) Switches in expression of Plasmodium falciparum var genes correlate
with changes in antigenic and cytoadherent phenotypes of infected erythrocytes. Cell 82:
101-110.
Taylor, H.M., Kyes, S.A., Harris, D., Kriek, N., and Newbold, C.I. (2000) A study of var
gene transcription in vitro using universal var gene primers. Mol Biochem Parasitol 105:
13-23.
Trimnell, A.R., Kraemer, S.M., Mukherjee, S., Phippard, D.J., Janes, J.H., Flamoe, E., et
al. (2006) Global genetic diversity and evolution of var genes associated with placental
and severe childhood malaria. Mol Biochem Parasitol 148: 169-180.
Table S1. Oligonucleotides used for PCR fragments on Southern blots, and for probes; 89% Bc6+ run-on results.

| Fragment | Template | Forward primer sequence (5' to 3') | Reverse primer sequence (5' to 3') | Conditions* | Expected product size | Run-on signal summary |
|---|---|---|---|---|---|---|
| HRP1 | genomic | CAACAAATGCTGCTACACCAG | TTTAACCACAGCATCCTC | 2.5/50 | 450bp | rings |
| MSP1 | genomic | GTCAAAAAACTAGAAGCTTTAG | ATCAATTAAATATTTGAAACC | 1.5/50 | 450bp | trophs |
| A4-5UT | plasmid | AATATGGAAGTAACGGAAT | GCTATCCAATACATGTTTGGCATC | 1.5/50 | 989bp | rings |
| A4-DBL1 | YAC | ATGAATATCATACTAATGTTA | ATATTCCGTATGAGAAAATGT | 3.0/50 | 1.2kbp | rings |
| A4-DBL3 | YAC | ACCAAGTTGGATGTGTGCGCC | AGAAGAATAACCTTTTTCTTTTAG | 2.0/50 | 1.1kbp | rings |
| A4-DBL5 | YAC | TCTATTTTAGACAGTACATTTG | TGTCCTATCCTGTGTATATAAT | 2.0/50 | 900bp | rings |
| A4-intron | genomic | GGCATTAGGATCCATTGC | GTCGACAGGGTGTTTAG | 2.5/45 | 1kbp | rings |
| A4-exon2 | plasmid | GTTACACCGATCATTATAGTG | CTCATTTTCCCACTCTT | 1.5/50 | 1.6kb | rings |
| A4-3UT | plasmid | CAAATTGGTGAAAGAG | ATAATATCAAATATATATATC | 1.5/45 | 300bp | rings |
| R29-3UT | plasmid | TCCTATATCAGATGTATG | TATACAAATAATCAAATGTGC | 1.5/45 | 350bp | trophs |
| R29-exon2 | plasmid | GGAAGGAGATTCAGATGA | TAGGTGTATCCACGTTTG | 3.5/50 | 890bp | rings/trophs |
| R29-intron | plasmid | ACAACCATTCCTTTTGGAG | GTATGTATGTATATATGTATGTA | 1.5/45 | 750bp | trophs |
| R29-DBL4 | plasmid | GATGTTTTATACTTTAGG | CTCTTATCACTCACAAGC | 1.5/50 | 1.1kbp | trophs |
| R29-DBL1 | genomic | GGGAATTCGAGTACACCGAAGGTAGAAAG | GGGAATTCTTCACAATATCCTGAAGGACC | 2.5/45 | 1kbp | negative |
| R29-5UT | plasmid | TGTTATTAGCAGTACAATG | AATTTCAATAAACATGTTCTC | 2.5/50 | 1kbp | negative |
| var-rif-intergenic | plasmid | TTCTATTATGTTCAATTA | GTGTTGTATTCATTCAAG | 2.5/45 | 1.2kbp | negative |
| ITrif13.1 | plasmid | CTCATGGGAAGTTGTTGC | TAAAACTATAGCTAGTATTGT | 1.5/50 | 430bp | negative |
| ITrif13.2 | plasmid | GCGGCGATGCCTGAAGTG | AGAAGTCTGAAAACTAGT | 2.5/45 | 450bp | negative |
| stevor | plasmid | AAATGTTATTGTTTAC | CCAAAGCTGCAATACCAC | 3.0/45 | 700bp | negative |
| A4Tres DBL1 | plasmid | CGGAATTCAGACAACCGGTTCGATTTTCC | CGGAATTCCTAAGATGAACTTTGCGTCTG | 1.5/50 | 800bp | negative |
| A4Tres DBL3 | plasmid | CGGAATTCACAGAGGACGCAAAATGGAA | CGGAATTCCTATGTATAATCCAACGATGC | 1.5/50 | 800bp | negative |
| ITgICAM DBL1 tag | plasmid | GCACGA/CAGTTTT/CGC | GCCCATTCG/CTCGAACCA | 2.5/45 | 400bp | negative |
| ITgICAM DBL2 | genomic | GGTTTAAAATAGGAACAC | GTTAGAAGCCATTTGTGC | 1.5/50 | 900bp | rings |
| Dd2var1 DBL1 | genomic | CAAGGACGTTTGTCAGAAGC | GATTACATGCATACAAACAG | 3.5/50 | 1kbp | rings |
| Dd2var1 DBL4 | genomic | ATACGGCAAAACCGCACC | TCCATTTACACATTTGTC | 3.5/45 | 1.3kbp | negative |
| FCR3-var1 5UT | genomic | AAAGAAAGAACGTGACGC | TCTAATGATGATGCTGCATTCC | 1.5/50 | 650bp | negative |
| FCR3-var1 DBL1 | genomic | TCTACGCGAGTAAATAAGC | GACAAATTTGTTATCGTTCG | 1.5/50 | 1kbp | rings, trophs |
| FCR3-var1 DBL3 | genomic | CAAGTAGAAGATTGTCATCC | CTGTTCAAGTAATCTGTTGC | 3.5/45 | 1kbp | rings, trophs |
| FCR3-var1 DBL7 | genomic | AATCCATTGGATAATTGTCC | AACTCCAAAGCGCATTGAG | 1.5/50 | 800bp | (negative/PCR product failure) |
| FCR3-var1 intron | genomic | AAGATCAATCTTCAG | AGGCATTCCATACTCTC | 1.5/50 | 750bp | negative |
| FCR3-var1 exon2 | genomic | TTCAAATCGTCTGTGGAC | TATCAATAGGTTTAGCAC | 1.5/50 | 650bp | rings/trophs |
| upsC | genomic | ACAAACATAGTGACTACC | GCCCATTCSTCGAACCA | 2.5/50 | 1.2kb | rings |
| ITvar1 DBL1 | genomic | GTCGAAATCAATGTACCAG | TCACATAGCGATGGCACG | 3.0/45 | 820bp | negative |
| ITvar1 DBL4 | genomic | TAAGGAAAACATAGACACTG | ATGGAATGCGTCACTTCACG | 3.5/45 | 840bp | negative |
| A4var-DBL2 | genomic | ACGAACCAATATTCCAATGCT | ATTTTTTGCATGTAGGTATGAT | 2.0/50 | 850bp | rings |
| CS2var-DBL2 | genomic | GACAACAGTCATAGTGGAGC | GAGGGTACAAGCGTCATCC | 2.0/50 | 1.1kbp | negative |
| CS2var-DBL3 | genomic | GTACCCTCAAATATAGTG | CATGGATCACAATAATCTG | 2.0/50 | 1.1kbp | negative |
| 3D7 PFL0030c5UT | 3D7 genomic | ACTATAAGATAAATTTAAGAGA | TATCATATTTCTTGTAATAGC | 2.5/45 | 700bp | negative |
| 3D7 PFL0030c5utORF | 3D7 genomic | CTAAATAGTTAGACATATAAC | ACTTGATTTATCCATTTTGTC | 1.5/45 | 650bp | negative |
| 3D7 PFL0030cDBL1 | 3D7 genomic | CTTGTGATAGAATACC | TTTGTTGATATAATTCTG | 2.5/45 | 800bp | negative |
| 3D7 PFL0030cDBL3 | 3D7 genomic | ATACTATAATACATGGAG | CATTATTAGTGCATGCGTC | 2.5/45 | 700bp | negative |
| 3D7 PFL0030cDBL6 | 3D7 genomic | CTTCGGACATTAATAAAGGTGTGC | CAATTATTTTTAACTTCTGTGTCATC | 1.5/45 | 800bp | negative |
| 3D7 PFL0030cexon2 | 3D7 genomic | ACGTGTACTTGATATACC | TTTCCATCTGATCGTCAC | 2.5/45 | 1.2kbp | negative |
| A4-var3 5'UT | A4 genomic | ATATTATGGATAATACAGATAG | TCTATACCAAATGATTGCCAT | 1.5/50 | 500bp | negative |
| A4-var3 DBL1 | A4 genomic | ACAGTAATGCTGGAGCATGT | CACACTTTGGATGTGTCAA | 2.0/50 | 500bp | negative |
| A4-var3 DBL2 | A4 genomic | ATCCATTAGAAAAATGTC | GTTTTGTTACGTATGATG | 1.5/45 | 872bp | negative |
| A4-var3 intron | A4 genomic | GGTGTCGCCTTAACTCTA | ATTTGTAATATCCGTATCATAT | 3.5/50 | 242bp | negative |
| A4-var3 exon2 | A4 genomic | TGAAGTAGATATGATACG | GAAATTGTTGTTCCAACG | 1.5/45 | 1.2kbp | rings |
| A4-var3 3'UT | A4 genomic | TGTCATTGTACATAATTCAATA | GGTATTTTACAACTTATGATAC | 3.5/50 | 840bp | negative |

* Concentration MgCl2 (mM)/annealing temperature (°C). Genomic = A4 genomic DNA. YAC = A4var-containing YAC.