Supplementary information

Materials and Methods

RT-PCR

One-step RT-PCR (Invitrogen) was used to obtain fragments of the full-length DP100 transcript. Total RNA extracted from CT26 tumors was used as the template. Forward and reverse primers were designed according to the sequence of mouse genome contig NT_082868, as discussed in the Supplementary results section. Reverse transcription was performed at 50 °C for 30 min. After incubation at 94 °C to inactivate the reverse transcriptase and activate the DNA polymerase, PCR amplification followed. To address whether the RT-PCR products were due to contamination with genomic DNA, control reactions were performed in which the reaction mixtures were held at 4 °C during the reverse transcription (RT) stage, so that no RT occurred, and were then processed through all the other RT-PCR steps. One-step RT-PCR products were sequenced by the institutional sequencing core facility using ABI 377 DNA sequencers. Aliquots of the RT-PCR products were resolved on 1% agarose gels.

5'- and 3'-RACE

GeneRacer (Invitrogen), which uses RNA ligase-mediated rapid amplification of the 5' and 3' cDNA ends (RLM-RACE), was applied to reach the 5' and 3' ends of the full-length DP100. In brief, 5 µg of total RNA from CT26 tumor was treated as described by the manufacturer and ligated to the GeneRacer RNA oligo. The primers used to produce the first-strand cDNAs were either the GeneRacer Oligo dT primer or a reverse primer specific for the DP100 sequence (primer RACE100_AS: 5'-AAC GTA CAT GCG CCA GTC GAA GGA-3'). For the 5'-RACE, cDNAs transcribed with the Oligo dT primer or the RACE100_AS primer were used as templates, the GeneRacer 5' primer (G5) served as the forward primer, and either the RACE100_AS primer or a primer located closer to the 5' end of the DP100 sequence (primer DP100_AS: 5'-TCT CCT AAA CTG CTC TGG TCA GCC TCC ATT A-3') was used as the reverse primer.
For the 3'-RACE, cDNAs transcribed with the Oligo dT primer were used as templates, the GeneRacer 3' primer (G3) served as the reverse primer, and a gene-specific primer located closer to the 3' end of the extended DP100 sequence (primer ext6720: 5'-TTC AAG TCC CTG CGG TGT CTT TG-3') was used as the forward primer. Aliquots of the RACE products were resolved on 1% agarose gels.

Results

Identification of the full-length hcn gene

5'-RACE of DP100 RNA was used to extend to the 5' end of the full-length hcn gene (Figure S1a). The hcn sequence was identical to part of the sequence of only one genomic contig in the NCBI mouse genome database, namely NT_082868. A BLAST search revealed numerous EST hits that matched sequentially along the sequence of NT_082868, starting from the nucleotide corresponding to the 5' end and extending ~7 kb downstream (data not shown). On the hypothesis that the 7-kb hcn gene lay within this region, four sets of primers were designed to amplify the additional segments of this region. Products were generated in all four sets of RT-PCR reactions (Figure S1b, lanes 1, 4, 7, and 10). The sequences of all four RT-PCR products were assembled into a 6850-bp fragment. Finally, 3'-RACE was used to reach the 3' end of the HCN transcript. A DNA band of ~300 bp was observed (Figure S1c). Sequencing of this DNA revealed the authentic sequence of NT_082868 followed by a stretch of 24 As at the 3' end. Because the 24-A segment was not present in the genomic sequence at this position in NT_082868, we concluded that the poly(A) tail marked the 3' end of the HCN transcript.

Table

Table S1 In situ hybridization analysis of MALAT-1 in various human carcinomas.

Cancer Type        Number(a) Analyzed    Number(b) (%) with Elevated HCN    Number(c) with Decreased HCN
Breast Cancer      10                    8 (80)                             0
Pancreas Cancer    12                    7 (58)                             0
Lung Cancer        32                    16 (50)                            0
Colon Cancer       18                    9 (50)                             0
Prostate Cancer    32                    7 (22)                             0

a Number of tumor/non-tumor pairs from different human patients in the tissue microarray slides.
b,c The intensity of MALAT-1 staining was scored as undetectable or low (0), moderate (1), strong (2), or very strong (3). The numbers represent tumor samples with higher (b) or lower (c) scores than the corresponding non-tumor samples.

Figure legend

Figure S1 Identification of the full-length hcn gene. Identification of (a) the 5' end (5'-RACE), (b) the middle section (RT-PCRs), and (c) the 3' end (3'-RACE) of the full-length DP100 transcript. For the 5'-RACE, the DP100_AS primer (lanes 1 and 3) or the RACE100_AS primer (lanes 2 and 4) was used as the reverse primer, and the first-strand cDNAs were generated with either the Oligo dT primer (lanes 1 and 2) or the RACE100_AS primer (lanes 3 and 4). For the RT-PCRs, primers were designed to amplify nt 1-2170 (I), nt 1290-3830 (II), nt 3740-5830 (III), and nt 5600-6850 (IV) of the proposed full-length DP100 gene. RT, reverse transcription; DNA, genomic DNA from CT26 tumors.
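
As a quick consistency check on the RT-PCR design described in the Figure S1 legend, the following minimal Python sketch tests whether the four amplicon ranges tile nt 1-6850 without gaps and reports the overlap between neighbouring amplicons. The function name tiling_report and the label IV for the last range are illustrative assumptions, not part of the original methods; coordinates are treated as 1-based and inclusive, which is also an assumption.

```python
# Amplicon coordinates (nt) taken from the Figure S1 legend.
# Assumption: coordinates are 1-based and inclusive; the label "IV" is
# a hypothetical name for the last range (nt 5600-6850).
AMPLICONS = {
    "I": (1, 2170),
    "II": (1290, 3830),
    "III": (3740, 5830),
    "IV": (5600, 6850),
}


def tiling_report(amplicons, total_length):
    """Return (covered, overlaps).

    covered  -- True if the amplicons span nt 1..total_length with no gap.
    overlaps -- overlap in nt between each pair of consecutive amplicons
                (a value of zero or less would mean the pair does not overlap).
    """
    ordered = sorted(amplicons.items(), key=lambda item: item[1][0])
    covered = ordered[0][1][0] == 1
    reach = ordered[0][1][1]  # right-most position covered so far
    overlaps = {}
    for (prev_name, (_, prev_end)), (name, (start, end)) in zip(ordered, ordered[1:]):
        if start > reach + 1:
            covered = False  # uncovered positions between reach and start
        overlaps[f"{prev_name}/{name}"] = prev_end - start + 1
        reach = max(reach, end)
    covered = covered and reach >= total_length
    return covered, overlaps


covered, overlaps = tiling_report(AMPLICONS, 6850)
print(covered)
print(overlaps)
```

Under these assumptions the four ranges cover the full 6850 nt, with overlaps of 881 nt (I/II), 91 nt (II/III), and 231 nt (III/IV), which is what allows the four sequenced products to be assembled into a single fragment.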