Gentile et al., Supplementary Information

Table of Contents

Supplementary Results
  Setup of a novel gene trapping strategy
    Rationale
    Construction and validation of trapping vectors
    Methods
  Mapping of trap integration sites by RAN-PCR
    Rationale
    Methods
Supplementary References

Supplementary Results

Setup of a novel gene trapping strategy

Rationale

We conceived a functional genomics approach to the study of invasive growth by identifying genes that are transcriptionally regulated by HGF in vitro in the MLP-29 mouse embryo liver cell line (Medico et al. 1996). Such genes are likely to be involved in the control of invasive growth and thus may directly contribute to the invasive-metastatic phenotype of neoplastic cells. Our approach entailed a strategy based on gene trapping.
Gene trapping exploits random integration of vectors carrying a promoterless reporter gene to generate reporter cell clones that display the transcriptional activity of virtually any gene in the genome. We previously developed a vector for systematic trapping of transcriptionally regulated genes, ROSA-GFNR (Medico et al. 2001), which requires cell sorting, a time-consuming and technically challenging procedure, for positive selection of traps in expressed genes. To overcome this limitation, we designed a new selectable reporter gene, named PGN because it is a fusion of three components: puromycin resistance (PuroR), GFP and E. coli nitroreductase (NTR). The PuroR component is the key innovation of the trapping construct, allowing simpler positive selection by pharmacological treatment instead of cell sorting.

Construction and validation of trapping vectors

We built the triple fusion PGN by linking the PuroR, GFP and NTR elements with a Ser-Gly spacer, and evaluated PGN efficiency by transient and stable transfection. Each element maintained its functionality, with a slight loss of efficiency (5-10%) in the triple fusion compared with the double fusions (Supp. Fig. 1A). We subsequently treated stable PGN transfectants for different periods and with different doses of either puromycin or MN. After treatment with low doses of puromycin for a week, not all cells displayed detectable GFP fluorescence. This indicated that very low levels of PGN expression are sufficient to confer resistance to puromycin, but not to confer detectable fluorescence. Conversely, higher drug doses progressively increased the fraction of fluorescent cells in a rapid (2-3 days) and dose-dependent fashion, and very high doses (20 µg/ml) could be used to select cells expressing very high levels of PGN.
As expected, MN treatment of cells resistant to low doses of puromycin counter-selected the brighter cells in a dose-dependent manner, with selection becoming progressively more effective as treatment duration increased (5-7 days; Supp. Fig. 1B). This indicated that the combined use of low puromycin and high MN allows selection of cells expressing PGN at very low levels.

After this preliminary validation, we cloned PGN between a strong cellular splice acceptor (SA) and a polyadenylation site (pA), obtaining the ‘Triple ACtivity Trap’ (TRACT) cassette. To overcome possible limits due to the presence of backbone-specific integration “hot spots” in the genome (De et al. 2005; Wu et al. 2003), the cassette was inserted into three different backbones: plasmidic, retroviral (Roberts et al. 1998) and lentiviral (Follenzi et al. 2000), to obtain P-TRACT, R-TRACT and L-TRACT, respectively (Supp. Fig. 1C). P-TRACT was initially tested on HEK-293T cells because of their high transfection efficiency. A total of 10^6 cells were transfected with 10^5 molecules of linearized plasmid and selected with puromycin. Over 100 cell clones were counted, confirming that trapping events in basally expressed genes can be promptly and efficiently isolated with a plasmidic trap as well. The new trapping cassette was inserted into retroviral and lentiviral backbones and used in a wide screening that yielded 179 clones carrying integrations into genes transcriptionally regulated by HGF.

The screening strategy described here relies on a well-established functional genomics technique, gene trapping (Gentile et al. 2003), and on a new selectable reporter, PGN. Despite being a triple fusion protein, PGN retained good efficiency of each domain and allowed efficient positive and negative pharmacological selection together with direct monitoring of expression by fluorescence analysis.
Interestingly, most traps selected for positive or negative response to HGF showed 1.5- to 2-fold regulation, indicating that a modest transcriptional response to HGF is sufficient for PGN to enable trap selection. In this view, possible applications of PGN as a selectable reporter may extend beyond the gene trapping constructs described here. The flexibility and efficiency of the PGN-based trap screening procedure make it suitable for wide surveys of transcriptionally regulated genes, with two features that render it complementary to microarray analysis: (i) gene traps integrate randomly in the genome, virtually interrogating any transcriptionally active domain of a cell, whereas microarrays interrogate a defined number of transcripts, exploring only a fraction of the transcriptional potential of the genome (Katayama et al. 2005) and not considering antisense transcription (Carninci et al. 2005); (ii) gene trapping generates reporter cell clones that can be used in high-throughput screenings to identify upstream genes, small molecules or peptides that modify the expression of the trapped gene.

Methods

The first step of PGN construction was to optimize the design of the double GFP-fusion proteins, GFP-NTR and PuroR-GFP, because we had previously seen that the fusion between GFP and NTR (GFNR) results in a slight loss of fluorescence and killing efficiency (10–15%; Medico et al. 2001). To generate an improved GFP-NTR fusion, we inserted a spacer of 15 amino acids (3x Ser-Gly-Gly-Gly-Gly) between GFP and NTR. To this aim, the NTR coding sequence was amplified from pGFNR (Medico et al.
2001) using a sense primer containing the PstI restriction site followed by the sequence coding for the Gly-Ser spacer and by the initial NTR coding sequence (5’-AACTGCAGGAGGCGGTGGATCAGGCGGTGGAGGCTCTGGTGAAGGCGGTTCCGGAGGCGGTTCAGCCATGGATATCATTTCTGTC-3’), and an antisense primer containing two restriction sites (SacI and SpeI) and the 3’ NTR coding sequence (5’-AGAGCTCACTAGTTTACACTTCGGTTAAGGT-3’). The PCR product was blunt-ended, digested with PstI and inserted, downstream of and in frame with the GFP coding sequence, into a modified pEGFP-C1 (Clontech) that had previously been digested with XhoI and EcoRI, blunt-ended and re-closed. The new plasmid, called pGFNR2, proved to be 5-10% more efficient than pGFNR (not shown).

We therefore decided to use the Gly-Ser linker also in the construction of the PuroR-GFP fusion, for which the puromycin resistance gene (PuroR) was amplified by PCR from a PuroR-containing plasmid with the following primers: (i) sense primer: 5’-CCAAGCTTAACCATGACCGAGTACAAGCCCACG-3’, and (ii) antisense primer (containing the 3x Gly-Gly-Gly-Gly-Ser linker sequence): 5’-GGAATTCTGAACCTCCGCCACCTGAACCTCCGCCACCTGAACCTCCGCCACCGGCACCGGGCTTGCGGGTC-3’. The PCR product was digested with HindIII and EcoRI and cloned in pEGFP-C1, upstream of and in frame with the GFP coding sequence. The new plasmid was called pPURO-GFP.

To obtain the PGN triple fusion, the PuroR+linker sequence was excised from pPURO-GFP and introduced into pGFNR2, upstream of and in frame with the GFP-NTR coding sequence. The new expression vector was called pPGN. All constructs were sequence-verified before their use in transfection experiments.

The trapping cassette (SA-PGN-PA) was constructed by excising the SA sequence from pSA-β-Gal (Friedrich and Soriano 1991) and inserting it upstream of PGN in the pPGN plasmid, in place of the CMV promoter. The new plasmid was called pSA-PGN.
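The primer descriptions above can be checked with a few lines of Python. The following is a minimal sketch, not part of the original protocol: it assumes that the spaces inside the printed primer sequences are line-break artifacts and joins them, and it uses the standard genetic code and the published recognition sequences of the enzymes named in the text. The dictionary keys and function names are illustrative only.

```python
# Sketch: sanity-check the cloning primers quoted in the Methods.
# Primer sequences are copied from the text (internal spaces removed, an
# assumption); all identifiers below are illustrative, not from the paper.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Recognition sequences of the restriction enzymes named in the text.
# All five are palindromic, so scanning one strand is sufficient.
SITES = {
    "PstI": "CTGCAG",
    "SacI": "GAGCTC",
    "SpeI": "ACTAGT",
    "HindIII": "AAGCTT",
    "EcoRI": "GAATTC",
}

PRIMERS = {
    "NTR_sense": (
        "AACTGCAGGAGGCGGTGGATCAGGCGGTGGAGGCTCTGGTGAAGGCGGTTCCGGAGG"
        "CGGTTCAGCCATGGATATCATTTCTGTC"
    ),
    "NTR_antisense": "AGAGCTCACTAGTTTACACTTCGGTTAAGGT",
    "PuroR_sense": "CCAAGCTTAACCATGACCGAGTACAAGCCCACG",
    "PuroR_antisense": (
        "GGAATTCTGAACCTCCGCCACCTGAACCTCCGCCACCTGAACCTCCGCCACCGGCACC"
        "GGGCTTGCGGGTC"
    ),
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of seq starting at offset `frame`."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def sites_in(seq: str) -> list:
    """List the enzymes from SITES whose recognition sequence occurs in seq."""
    return [name for name, site in SITES.items() if site in seq]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt, sites:", sites_in(seq))
    # The antisense primer is read on the coding strand after reverse
    # complementing; frame 1 is the frame that exposes the linker here.
    coding = reverse_complement(PRIMERS["PuroR_antisense"])
    print("PuroR 3' end + linker:", translate(coding, frame=1))
```

Under these assumptions, each primer should contain the restriction site(s) the text attributes to it, and translating the reverse complement of the PuroR antisense primer in frame 1 should show three consecutive Gly-Gly-Gly-Gly-Ser repeats. A check of this kind is only a plausibility test on the printed sequences and does not replace sequence verification of the constructs.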
Subsequently, the bovine growth hormone poly(A) sequence was excised from pSA-β-Gal and inserted downstream of PGN in the pSA-PGN plasmid. The new plasmid, containing the trapping cassette SA-PGN-PA, was called pP-TRACT. The lentiviral version of the trap (pL-TRACT) was obtained by replacing the GFP gene and its PGK promoter in pRRL.SIN.cPPT.PGK.GFP (Follenzi et al. 2000) with the trapping cassette in reverse orientation with respect to the vector. The retroviral version of the trap (pR-TRACT) was obtained by replacing PGK.YFP in the MLV-based SIN retroviral vector pRkat43.3.PGK.YFP (Roberts et al. 1998) with the trapping cassette in reverse orientation with respect to the vector. Viral traps were produced and titrated according to standard procedures (De et al. 2005).

Mapping of trap integration sites by RAN-PCR

Rationale

We developed a new strategy to identify the genomic regions flanking the integrated gene trap. The approach takes advantage of the presence of repeat sequences in the genome (Alu for human, B2 for mouse) to “anchor” PCR products containing the trap sequence. The strategy is therefore named “Repeat-ANchored” PCR (RAN-PCR), and is performed on genomic DNA extracted from trap clones. The advantage of this procedure over existing ones is that it does not require any enzymatic reaction (digestion/ligation of the DNA, reverse transcription of the RNA) before PCR and is therefore more amenable to high-throughput use.

Methods

RAN-PCR is an improvement of the Alu-PCR protocol (Cole et al. 1991). With Alu-PCR, vector integration sites can be identified by performing a PCR on human genomic DNA with one primer designed on the known vector sequence and the other on the Alu sequences, which are frequently repeated in the genome. The limit of this technique is the high number of nonspecific Alu-Alu amplification products.
To overcome this limit, and to adapt the procedure to mouse genomic DNA, we designed a new procedure, illustrated in Supplementary Figure 2, based on a primer matching the B2 repeat, which is frequent in the mouse genome (Krayev et al. 1982), flanked by a “suppression sequence” at its 5’ end. The primer containing the suppression sequence was named Supp-B2 and had the following sequence: 5’-CTAATACGACTCACTATAGGGCGGCTGGTGAGATGGTTCAGT-3’. A second primer, containing only the suppression sequence, was named Supp. The two antisense primers designed on the viral vectors are 5’-GCCTCAATAAAGCTTGCCTTG-3’ and 5’-TCCCAGGCTCAGATCTGGTCTAAC-3’, for the first PCR and the nested PCR respectively. When B2-B2 amplification occurs, the resulting amplicons carry the suppression sequence at both ends; these ends anneal intramolecularly, forming a panhandle-like structure that cannot be amplified, which greatly increases the specificity of PCR amplification.

Genomic DNA was extracted from trapped cell clones using the Blood & Cell Culture DNA Midi Kit (Qiagen). PCR reactions were performed in a volume of 25 µl with 250 ng of genomic DNA, 12.5 µM antisense primer, 0.2 µM Supp-B2 primer and 1 µM Supp primer, using Advantage PCR mix (Clontech) with the following cycles: 5x (94°C 10 sec, 66°C 10 sec, 72°C 3 min), 5x (94°C 10 sec, 70°C 10 sec, 72°C 3 min), 25x (94°C 10 sec, 68°C 10 sec, 72°C 3 min). A 1 µl aliquot of the PCR reaction was used as template for the nested PCR, under the same conditions but with the nested antisense primer.

Supplementary References

Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, et al. (2005). The transcriptional landscape of the mammalian genome. Science, 309, 1559-1563.
Cole CG, Goodfellow PN, Bobrow M and Bentley DR. (1991). Generation of novel sequence tagged sites (STSs) from discrete chromosomal regions using Alu-PCR. Genomics, 10, 816-826.
De PM, Montini E, de Sio FR, Benedicenti F, Gentile A, Medico E and Naldini L. (2005).
Promoter trapping reveals significant differences in integration site selection between MLV and HIV vectors in primary hematopoietic cells. Blood, 105, 2307-2315.
Follenzi A, Ailles LE, Bakovic S, Geuna M and Naldini L. (2000). Gene transfer by lentiviral vectors is limited by nuclear translocation and rescued by HIV-1 pol sequences. Nat Genet, 25, 217-222.
Friedrich G and Soriano P. (1991). Promoter traps in embryonic stem cells: a genetic screen to identify and mutate developmental genes in mice. Genes Dev, 5, 1513-1523.
Gentile A, D'Alessandro L and Medico E. (2003). Gene trapping: a multi-purpose tool for functional genomics. Biotechnol Genet Eng Rev, 20, 77-100.
Katayama S, Tomaru Y, Kasukawa T, Waki K, Nakanishi M, Nakamura M, Nishida H, Yap CC, Suzuki M, Kawai J, Suzuki H, Carninci P, Hayashizaki Y, Wells C, Frith M, Ravasi T, Pang KC, Hallinan J, Mattick J, Hume DA, Lipovich L, Batalov S, Engstrom PG, Mizuno Y, Faghihi MA, Sandelin A, Chalk AM, Mottagui-Tabar S, Liang Z, Lenhard B and Wahlestedt C. (2005). Antisense transcription in the mammalian transcriptome. Science, 309, 1564-1566.
Krayev AS, Markusheva TV, Kramerov DA, Ryskov AP, Skryabin KG, Bayev AA and Georgiev GP. (1982). Ubiquitous transposon-like repeats B1 and B2 of the mouse genome: B2 sequencing. Nucleic Acids Res, 10, 7461-7475.
Medico E, Gambarotta G, Gentile A, Comoglio PM and Soriano P. (2001). A gene trap vector system for identifying transcriptionally responsive genes. Nat Biotechnol, 19, 579-582.
Medico E, Mongiovi AM, Huff J, Jelinek MA, Follenzi A, Gaudino G, Parsons JT and Comoglio PM. (1996). The tyrosine kinase receptors Ron and Sea control "scattering" and morphogenesis of liver progenitor cells in vitro. Mol Biol Cell, 7, 495-504.
Roberts MR, Cooke KS, Tran AC, Smith KA, Lin WY, Wang M, Dull TJ, Farson D, Zsebo KM and Finer MH. (1998). Antigen-specific cytolysis by neutrophils and NK cells expressing chimeric immune receptors bearing zeta or gamma signaling domains.
J Immunol, 161, 375-384.
Wu X, Li Y, Crise B and Burgess SM. (2003). Transcription start regions in the human genome are favored targets for MLV integration. Science, 300, 1749-1751.