Global identification of human transcribed sequences with genome

Materials and Methods Human genome sequence processing The template sequences used to design the microarrays were derived from UCSC HG13/NCBI Build 31 (UCSC Human Genome Project Working Draft, November 2002 assembly) of the human genome sequence assembly (1). Each chromosome sequence was screened for repetitive elements and low-complexity DNA with RepeatMasker (2) in sensitive mode, in conjunction with the RepBase collection of repetitive sequence elements (3, 4). Additional low-complexity sequence filtering was performed with the NSEG program (5) using a minimum segment length of 21 nucleotides, trigger complexity of 1.4, and extension complexity of 1.6. Oligonucleotide probe selection Following repeat sequence identification and low-complexity filtering, the remaining 1.5 Gb of nonrepetitive DNA was processed by the NASA Oligonucleotide Probe Selection Algorithm (NOPSA), which is designed to identify optimal hybridization probes according to several criteria: 1) nucleotide frequency information, calculated to determine the uniqueness of every 36mer in the genome; this measure was used to select probes that occur less than 5 times on average to reduce the potential for cross-hybridization; 2) intra-oligo alignment scores, used to exclude sequences that could form a loop with a stem greater than 7 bases; 3) other sequence-dependent factors such as length, extent of complementarity and overall base composition. A total of 51,874,388 36mer oligonucleotide probes were selected to represent both sense and antisense strands of the nonrepetitive DNA at an average resolution of 46 nt (probes are spaced every 10 nt on average). For DNA synthesis purposes, each selected probe was designed not to use more than 75% of the maximum number of cycles required to synthesize an oligomer of that length. For 36mer oligonucleotides this cutoff was 108 cycles. Virtual masks describing the layout of each microarray design were developed with the ArrayScribe program (NimbleGen Systems, Madison, WI). Microarray construction High-density oligonucleotide arrays were fabricated as described (6-8). Briefly, DNA synthesis reagents (Glen Research, Sterling, VA), (Proligo, Boulder, CO), (Amersham Pharmacia, Piscataway, NJ), (Applied Biosystems, Foster City, CA) were used in conjunction with Expedite DNA synthesizers (Applied Biosystems). Maskless Array Synthesis (MAS) units (NimbleGen Systems, Madison, WI) were connected to the DNA synthesizers to manufacture custom arrays using photolabile phosophoramidites (NPPOC- dAdenosine (N6-tac) -Cyanoethylphosphoramidite, NPPOC-dCytidine (N4Isobutyryl) -Cyanoethylphosphoramidite NPPOC-dGuanosine (N2-ipac) Cyanoethylphosphoramidite, NPPOC-dThymidine-Cyanoethylphosphoramidite) obtained from Proligo (Boulder, CO). After synthesis on the MAS was completed, the baseprotecting groups were removed in a solution of ethylenediamine:ethanol (1:1 v/v) (Aldrich, St. Louis, MO) for two hours. The arrays were rinsed with water, dried and stored desiccated until use. Sample preparation and labeling Triple-selected human liver tissue poly (A)+ RNA pooled from several individuals was obtained from Ambion (Austin, TX). First-strand cDNA was generated via M-MLV reverse transcriptase (RNase H-) with equimolar concentrations of oligo(dT) primers and random decamers. The reactions were carried out at 42°C for 2 hours in the presence of amino allyl-dUTP to facilitate the secondary labeling of an amine-reactive fluorescent conjugate. Following reverse transcription the products were heated at 95°C for 5 minutes to denature the RNA:DNA hybrids and heat-inactivate the reverse transcriptase, after which the RNA template was hydrolyzed via incubation with NaOH at 65°C for 15 minutes. Reverse transcription products from 20 separate reactions were produced in this manner and pooled to reduce technical variability between samples. The cDNAs were precipitated in ethanol:isopropanol (1:1 v/v) and resuspended in 0.1 M NaHCO3 to facilitate coupling of Alexa Fluor 555 NHS esters (Molecular Probes, Eugene, OR) to the reactive groups of the amino allyl-dUTPs. Following incubation at room temperature for 1 hour, the labeled cDNAs were purified with CyScribe GFX glass fiber spin columns (Amersham Bioscience, Piscataway, NJ) and isopropanol-precipitated. Microarray hybridization Microarrays were hybridized with 2-3 µg labeled cDNA in 320 µL hybridization buffer (50 mM MES, 0.5 M NaCl, 10 mM EDTA, and 0.005% (v/v) Tween-20) for 20 hours at 50°C. Hybridizations were performed in disposable adhesive chambers (Grace BioLabs, Bend, OR) in a hybridization oven with constant agitation. After hybridization, the arrays were washed on an orbital platform in non-stringent buffer (6× SSPE, 0.01% [v/v] Tween-20) for 10 minutes at room temperature, then in stringent buffer (100 mM MES, 0.1 M NaCl, 0.01% Tween-20) for 30 minutes at 45°C. This was followed by a 5-minute wash in non-stringent buffer and a 4-minute wash in 0.2X SSC. The arrays were dried with compressed nitrogen. Images were acquired with an Axon 4000B laser scanner at 5µm resolution and intensity data were extracted with NimbleScan software (NimbleGen Systems). Validation of TAR sequences RT-PCR was performed to validate 96 transcriptionally active regions (TARs) from throughout the genome. PCR primer pairs were designed for each region and the reactions were carried out with the RETROscript system (Ambion) with 0.25 µg poly (A)+ human liver RNA for each sample tested. An identical aliquot of each reaction mixture was used as a minus-reverse transcriptase control. PCR products were electrophoresed on 2.5% fine-resolution agarose gels; 90 of the 96 displayed bands of the appropriate size with no detectable band in the negative control, while 10 of the positive amplicons produced multiple bands in addition to the targeted product. References 1. E. S. Lander et al., Nature 409, 860 (2001). 2. A. F. A. Smit, P. Green, unpublished results. 3. J. Jurka, Trends Genet. 9, 418 (2000). 4. RepBase Update, volume 7, issue 2. 5. J. C. Wooton, S. Federhen, Methods Enzymol. 266, 554 (1996). 6. S. Singh-Gasson et al., Nat. Biotechnol. 17, 974 (1999). 7. E. F. Nuwaysir et al., Genome Res. 12, 1749 (2002). 8. T. J. Albert et al., Nucleic Acids Res. 31, e35 (2003).

Global identification of human transcribed sequences with genome

Related documents

Products

Support

Global identification of human transcribed sequences with genome

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib