Supplementary materials

Packaging guest proteins into the encapsulin nanocompartment from Rhodococcus erythropolis N771

Akio Tamura, Yosuke Fukutani, Taku Takami, Yoshihiko Murakami, Keiichi Noguchi, Masafumi Yohda and Masafumi Odaka

Supplementary Methods

Construction of expression vectors

Primer set 1 (see Supplementary Table 1) for the cloning of Reencapsulin was designed based on the sequence of the BAH36698.1 gene of R. erythropolis PR4. The Reencapsulin gene was amplified via PCR with primer set 1 using R. erythropolis N771 genomic DNA as a template. Amplified DNA fragments were digested with NdeI and XhoI and then inserted into the NdeI and XhoI sites of the pET-23b plasmid to generate pET-Enc. To construct the expression vector (pET-Enc-His) for Reencapsulin with a 6-His-tag at the C-terminus (ReencapsulinHis), the termination codon of the Reencapsulin gene in pET-Enc was replaced with the codon for Glu using the QuikChange Site-Directed Mutagenesis Kit (Agilent Technologies, Santa Clara, CA) with primer set 2.

The gene for DypB peroxidase of R. erythropolis N771 (ReDypB) was amplified via PCR with primer set 3, which was designed based on the putative DypB gene of R. erythropolis PR4, using R. erythropolis N771 DNA as a template. The fragment was digested with NdeI and HindIII and then inserted into the NdeI and HindIII sites of the pET-30b plasmid to generate pET-Dyp.

To construct the vector for introducing the C-terminal 37-amino-acid sequence of ReDypB into heterologous guest proteins, an NcoI site was inserted just upstream of the codon for the 37th amino acid from the C-terminus of ReDypB in pET-Dyp via site-directed mutagenesis with primer set 4 to generate pET-DypBtag. NdeI and NcoI sites were then introduced at the N- and C-termini, respectively, of EGFP and firefly luciferase (Luc) via PCR with primer sets 5 and 6. In both cases, the DNA sequence encoding the Strep-tag, WSHPQFEK, was inserted just after the NdeI site.
The resulting fragments were digested with NdeI and NcoI and then inserted into the NdeI and NcoI sites of the pET-DypBtag plasmid to generate pET-EGFP-tags and pET-Luc-tags. The constructed EGFP and Luc with tag sequences were designated EGFPtags and Luctags, respectively. The DNA sequences of all amplified DNA fragments were verified using an ABI3130 Automated Capillary DNA Sequencer (Life Technologies, Inc., Carlsbad, CA).

Preparation of Reencapsulin proteins

An Escherichia coli strain, BL21(DE3), harboring pET-Enc for wild-type Reencapsulin or pET-Enc-His for ReencapsulinHis was cultured in LB medium containing 100 µg/mL ampicillin at 37°C for 18 h without the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG). The cells were harvested by centrifugation at 5,000 ×g for 10 min at 4°C. Typically, 4 g (wet weight) of E. coli cells were recovered from one liter of culture. Cells were resuspended in 50 mM Tris-HCl (pH 7.5) containing 25 mM MgCl2 and 1.0 mM dithiothreitol (DTT) and disrupted via sonication; the crude extract was recovered via centrifugation at 20,000 ×g for 15 min at 4°C.

For wild-type Reencapsulin, the crude extract was subjected to anion-exchange chromatography using a DEAE Toyopearl 650M column (I.D. 2.5×8.0 cm, Tosoh, Japan) equilibrated with 50 mM Tris-HCl (pH 7.5) containing 25 mM MgCl2 and 1.0 mM DTT. After washing with the same buffer, wild-type Reencapsulin was eluted with a linear 0-500 mM NaCl gradient (total volume 200 mL). Fractions containing wild-type Reencapsulin were concentrated with Amicon Ultra (cutoff 10K, Merck-Millipore, Billerica, MA) and then subjected to gel-filtration chromatography using a HiLoad 26/600 Superdex 200 pg column (I.D. 2.6×60 cm, GE Healthcare, Buckinghamshire, England) equilibrated with 50 mM Tris-HCl (pH 7.5) containing 500 mM NaCl, 25 mM MgCl2 and 1.0 mM DTT.

For ReencapsulinHis, E. coli BL21(DE3) harboring pET-Enc-His was cultured in LB medium containing 100 µg/mL ampicillin at 37°C without the addition of IPTG. The crude extract, prepared as described above, was applied to a HisTrap FF crude column (1.0 mL, GE Healthcare, Buckinghamshire, England) equilibrated with 50 mM Tris-HCl (pH 8.0) containing 500 mM NaCl and eluted with a linear 0-250 mM imidazole gradient (total volume 40 mL). The fractions containing ReencapsulinHis were subjected to gel-filtration chromatography as described above. Both proteins were detected as a single band in SDS-PAGE with Coomassie Brilliant Blue staining (Supplementary Figure X). The purified proteins were stored at -80°C until use.

Preparation of EGFP and Luc with tag sequences and their coexpression with Reencapsulin

EGFPtags and Luctags were expressed by culturing E. coli BL21(DE3) harboring pET-EGFP-tags and pET-Luc-tags, respectively, in LB medium containing 70 µg/mL kanamycin at 37°C. When the OD600 reached ~0.4, IPTG was added to a final concentration of 1.0 mM, and cultivation was continued for 18 h at 20°C. The crude extracts, prepared as described above, were applied to a Strep-Tactin Sepharose column (5.0 mL, IBA GmbH, Göttingen, Germany) and eluted according to the manufacturer's protocol. The fractions containing EGFPtags and Luctags, respectively, were subjected to gel-filtration chromatography as described above.

ReencapsulinHis was coexpressed with EGFPtags or Luctags by culturing BL21(DE3) cells harboring pET-EGFP-tags and pET-Enc-His, or pET-Luc-tags and pET-Enc-His, in LB medium containing 100 µg/mL ampicillin and 70 µg/mL kanamycin at 37°C. When the OD600 reached ~0.4, IPTG was added to a final concentration of 1.0 mM, and cultivation was continued for 18 h at 20°C. ReencapsulinHis coexpressed with EGFPtags or Luctags was purified by the same methods used for ReencapsulinHis.

Supplementary Table 1. Primer sequences for construction of plasmids.
Substituted bases are written in lowercase letters.

Primer set   Oligonucleotide sequences
1            5' GGAATTCcatATGACAAACCTGCACCGCG 3'
             5' CCGctcgagTCACGCAAGTGAGACAACGG 3'
2            5' GTCTCACTTGCGTcACTCGAGCACCACCACCACC 3'
             5' GGTGGTGGTGGTGCTCGAGTgACGCAAGTGAGAC 3'
3            5' GGAATTCcatATGGCCCTTCCCGCGATAC 3'
             5' TTgcggccgcTCATTGCTGGGCACTCC 3'
4            5' CTTCGTTCCGACGccatggTTTCTCGACGATC 3'
             5' GATCGTCGAGAAAccatggCGTCGGAACGAAG 3'
5            5' GGAATTCCATATGtggagccatccgcagttcgaaaagGTGAGCAAGGGCGAGG 3'
             5' CATGccatggGTACTTGTACAGCTCGTCCATG 3'
6            5' GGAATTCCATATGtggagccatccgcagttcgaaaagGAAGACGCCAAAAACATAAAG 3'
             5' CATGccatggGTACAATTTGGACTTTCCGCC 3'

Supplementary Fig. S1. (A) Amino acid sequence alignment of selected encapsulin proteins. The predicted secondary structures are arranged above the amino acid residues of Reencapsulin. The Greek letters α, β and η denote α-helix, β-sheet and β-turn, respectively. The conserved regions are enclosed in blue boxes. Within them, homologous amino acid residues are written in red, and residues that are identical among all encapsulins are shaded in red. R. erythropolis N771, encapsulin from Rhodococcus erythropolis N771; B. linens, Linocin M18 protein from Brevibacterium linens M18; R. jostii RHA1, Rjencapsulin from Rhodococcus jostii RHA1; Tmencapsulin from Thermotoga maritima. The secondary structures were predicted based on the structure of Tmencapsulin (PDB I.D. 3DKT) by SWISS-MODEL (http://swissmodel.expasy.org/). Multiple amino acid sequence alignment was performed by ClustalW (http://clustalw.ddbj.nig.ac.jp/), and the secondary structure depiction was performed by ESPript3 (http://espript.ibcp.fr/ESPript/ESPript/index.php). (B) The predicted structure of the Reencapsulin monomer. The monomer is colored in a rainbow scheme from the N-terminus (blue) to the C-terminus (red). The structure was predicted based on the structure of Tmencapsulin (PDB I.D. 3DKT) by SWISS-MODEL (http://swissmodel.expasy.org/).

Supplementary Fig. S2. Size distribution of the nanocompartments of (A) wild-type Reencapsulin, (B) ReencapsulinHis, (C) ReencapsulinHis coexpressed with EGFPtags, and (D) ReencapsulinHis coexpressed with Luctags. For each nanocompartment, 100 particles were randomly chosen from a negative-stained transmission electron micrograph, and their sizes were determined manually.

Supplementary Fig. S3. (A) Plot of the apparent molecular weight of ReencapsulinHis versus elution volume in FFF-MALS. Twenty microliters of ReencapsulinHis was injected into the system. The black line represents the apparent molecular weights estimated from the light scattering measurement. The red and blue smooth curves are elution profiles from field-flow fractionation monitored by light scattering at 90° and by differential refractive index, respectively. The vertical scale of the elution profiles is in arbitrary units. (B) Negative-stained transmission electron micrographs showing the dimeric particles observed in wild-type Reencapsulin. The picture was obtained after negative staining with 0.5% phosphotungstic acid, and the bars represent 50 nm.

Supplementary Fig. S4. Amino acid sequence alignment of the C-terminal regions of selected DypB proteins. Positions with a single, fully conserved residue are marked with "*". The putative "signal" sequences of the C-terminal 37 amino acid residues are enclosed in a black box. R. erythropolis N771, Rhodococcus erythropolis N771; M. kansasii ATCC12478, Mycobacterium kansasii ATCC 12478; M. tuberculosis, Mycobacterium tuberculosis; M. smegmatis JS623, Mycobacterium smegmatis JS623; R. equi 103S, Rhodococcus equi 103S. Multiple amino acid sequence alignment was performed by ClustalW (http://www.genome.jp/tools/clustalw/).

Supplementary Fig. S5. Schematic illustrating the construction of (A) ReencapsulinHis, (B) EGFPtags, and (C) Luctags.
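The tag design can be checked directly from the primers in Supplementary Table 1. The following is a minimal sketch, not part of the original methods: it translates the lowercase insert in the forward primers of sets 5 and 6 and reads the frame starting at the ATG inside the NdeI site (CATATG). The function names `translate` and `reading_frame_after_ndei` are hypothetical helpers introduced here for illustration, and the standard genetic code is assumed.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate coding-strand DNA, ignoring case and any trailing partial codon."""
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reading_frame_after_ndei(primer: str) -> str:
    """Translate a forward primer from the ATG inside its NdeI site (CATATG)."""
    site = primer.upper().find("CATATG")
    if site < 0:
        raise ValueError("no NdeI site (CATATG) found in primer")
    # The ATG start codon is the last three bases of CATATG.
    return translate(primer[site + 3:])


# Sequences copied from Supplementary Table 1 (forward primers of sets 5 and 6).
STREP_TAG_DNA = "tggagccatccgcagttcgaaaag"
EGFP_FORWARD = "GGAATTCCATATGtggagccatccgcagttcgaaaagGTGAGCAAGGGCGAGG"
LUC_FORWARD = "GGAATTCCATATGtggagccatccgcagttcgaaaagGAAGACGCCAAAAACATAAAG"

if __name__ == "__main__":
    print(translate(STREP_TAG_DNA))                # tag peptide
    print(reading_frame_after_ndei(EGFP_FORWARD))  # N-terminus of EGFPtags
    print(reading_frame_after_ndei(LUC_FORWARD))   # N-terminus of Luctags
```

Under these assumptions, the lowercase insert translates to WSHPQFEK, the Strep-tag sequence named in the Supplementary Methods, and both forward primers keep the tag in frame with the start codon.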
[Figure: gel-filtration elution profiles, panels A to C, with peaks labeled ReencapsulinHis and EGFPtags; horizontal axis: elution time (min), 0 to 30]

Supplementary Fig. S6. Gel-filtration HPLC of (A) empty ReencapsulinHis (100 μg), (B) EGFPtags (100 μg), and (C) free EGFPtags (17 μg) incubated with empty ReencapsulinHis (83 μg) for 10 min at room temperature 4 °C before loading the column. Empty ReencapsulinHis was dissolved in 50 mM Tris-HCl (pH 8.0), 500 mM NaCl, 1 mM DTT, whereas EGFPtags was dissolved in 50 mM Tris-HCl (pH 8.0), 500 mM NaCl. All samples were applied to a WTC SEC column, WTC100-S5 (I.D. 7.8×300 mm, Wyatt Technology Corp., Santa Barbara, CA), connected to a Gulliver 1500 Intelligent HPLC system (JASCO Co., Ltd, Japan). The elution solvent was 50 mM Tris-HCl (pH 8.0) containing 500 mM NaCl. The flow rate was 0.50 mL/min, and the elution was monitored by the absorbance at 280 nm. The vertical scale of the elution profiles is in arbitrary units. Each peak was fractionated and then analyzed by SDS-PAGE. In (A) and (C), the peak at approximately 23 min is likely to be DTT from the buffer. For (C), the fluorescence emission was measured with a GloMax-Multi+ Microplate Multimode Reader (Promega, Madison, WI). The excitation wavelength was 395 nm, and the fluorescence emission at 509 nm was recorded. The fraction from the ReencapsulinHis peak exhibited 3.9×10² counts/s, whereas that from the EGFPtags peak exhibited 2.0×10⁴ counts/s. The protein concentrations of the ReencapsulinHis and EGFPtags peaks were 5.1×10⁻² and 1.2×10⁻² mg/mL, respectively. The background fluorescence emission was 3.7×10² counts/s, meaning that the fluorescence emission of the ReencapsulinHis fraction was close to the background level.
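The comparison in the Fig. S6 caption can be made explicit with a short calculation. The sketch below is illustrative only: subtracting the background and dividing by protein concentration is a normalization assumed here, not a procedure stated in the source, and `specific_fluorescence` is a hypothetical helper name. The numerical inputs are the values reported in the caption.

```python
def specific_fluorescence(counts_per_s: float,
                          background: float,
                          conc_mg_per_ml: float) -> float:
    """Background-subtracted fluorescence per unit protein concentration.

    Units: (counts/s) per (mg/mL). This normalization is an assumption made
    for illustration; the source reports only the raw values.
    """
    return (counts_per_s - background) / conc_mg_per_ml


# Values reported in the Supplementary Fig. S6 caption.
BACKGROUND = 3.7e2  # counts/s

encapsulin_peak = specific_fluorescence(3.9e2, BACKGROUND, 5.1e-2)
egfp_peak = specific_fluorescence(2.0e4, BACKGROUND, 1.2e-2)

if __name__ == "__main__":
    print(f"ReencapsulinHis peak: {encapsulin_peak:.3g} counts/s per mg/mL")
    print(f"EGFPtags peak:        {egfp_peak:.3g} counts/s per mg/mL")
    print(f"ratio:                {encapsulin_peak / egfp_peak:.2g}")
```

With these inputs, the net signal of the ReencapsulinHis fraction is 20 counts/s above background, so its concentration-normalized fluorescence is several thousand times lower than that of the EGFPtags fraction. This agrees with the caption's statement that the ReencapsulinHis fraction is close to background.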