CAGE cap analysis of gene expression
Previously, it was believed that the human genome consisted of a few hundred
thousand genes, but advances in the analysis of genomic sequences and in cDNA
projects revealed that the human genome contains only about 20,000 genes.
It has also been discovered that the regulation of gene expression varies.
To unravel the mechanisms behind this variation, it is very important to obtain a
gene expression profile showing when, where, and in what quantity each gene is
expressed. CAGE, our original technology, excises the first 20 or so base pairs from
the 5' end of mRNAs as tags. By combining this with high-speed DNA sequencing
technology, it is possible to obtain a genome-wide expression profile.
CAGE is based on a series of full-length cDNA technologies developed by the Mouse
Encyclopedia Project. To produce a CAGE library, cDNA complementary strands are
first synthesized from total RNA, extracted from cells or tissues, using random or
oligo dT primers. The 5' end of the cDNA is then selected using the cap-trapper
method. Next, the RNA strand is removed using RNase I, and a biotinylated linker is
attached to the 5' end of the resulting single-strand cDNA. This linker contains
recognition sites that are essential for cloning, short specific base sequences, and
an endonuclease recognition site (MmeI). After the second cDNA strand is
synthesized, 20 nucleotides are cut from the 5' end by the class IIS restriction
enzyme MmeI. This fragment is called a tag sequence. A second linker, which includes
another restriction enzyme site (XmaJI), is then attached to the 3' side of the tag
sequence. Using streptavidin-coated magnetic beads, only the biotinylated cDNA tags
are extracted and purified. The recognition sites in the linkers at both ends are then
cut by restriction enzyme digestion, so that the tags between the linkers can be
excised. In this manner, the DNA sequences of CAGE tags can be determined and
mapped onto the genome.
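The MmeI cleavage step can be mimicked in software when working with sequenced linker-tag constructs. The following is a minimal sketch, not the pipeline used for CAGE itself. It assumes that the MmeI recognition site (TCCRAC, where R is A or G) sits immediately next to the 5' end of the cDNA and that the tag is the 20 nucleotides that follow; the function name `extract_tag` and the example sequence are hypothetical.

```python
import re
from typing import Optional

# MmeI recognition site: TCCRAC, where R is A or G.
# Assumption: the site lies directly next to the cDNA 5' end,
# so the tag is the 20 nt immediately downstream of the site.
MMEI_SITE = re.compile(r"TCC[AG]AC")
TAG_LENGTH = 20


def extract_tag(construct: str, tag_length: int = TAG_LENGTH) -> Optional[str]:
    """Return the tag following the first MmeI site, or None if no full tag exists."""
    sequence = construct.upper()
    match = MMEI_SITE.search(sequence)
    if match is None:
        return None  # no MmeI site in this construct
    start = match.end()
    tag = sequence[start:start + tag_length]
    # Discard truncated tags: a usable tag must have the full length.
    return tag if len(tag) == tag_length else None


if __name__ == "__main__":
    # Made-up linker fragment (GGATCCGAC) followed by made-up cDNA sequence.
    example = "GGATCCGAC" + "ACGTACGTACGTACGTACGTTTTT"
    print(extract_tag(example))
```

Constructs without a site, or with fewer than 20 nucleotides after it, return `None` so that they can be filtered out before mapping.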
Initially, we sequenced concatemers of more than 10 CAGE tags. In FANTOM3, we acquired a
genome-wide map of transcription initiation sites by CAGE.
Furthermore, the recent introduction of shotgun sequencers has brought CAGE into a new phase. CAGE
has become a powerful tool for quantitative analysis of gene expression: all CAGE tags, including
overlapping and duplicated ones, are sequenced and mapped onto the genome, and the expression level
of each gene is then measured from the number of tags mapped to it.
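The counting step described above can be sketched in a few lines. This is an illustrative example only: it assumes that each tag has already been mapped and assigned a gene or promoter identifier (with `None` for unmapped tags), and the normalization to tags per million is a common convention rather than something the text specifies. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_tags(mapped_tags: Iterable[Optional[str]]) -> Counter:
    """Count mapped tags per gene or promoter, ignoring unmapped tags (None)."""
    return Counter(tag for tag in mapped_tags if tag is not None)


def tags_per_million(counts: Counter) -> Dict[str, float]:
    """Scale raw tag counts so that libraries of different depth are comparable."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Made-up mapping result: one identifier per sequenced tag.
    mapped = ["geneA", "geneA", "geneB", None, "geneA"]
    counts = count_tags(mapped)
    print(counts)
    print(tags_per_million(counts))
```

Duplicated tags are deliberately kept, since the number of identical tags is the expression signal.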
Since shotgun sequencers can sequence 1,000,000 to 10,000,000 tags in one run, applying them to CAGE
makes it theoretically possible to capture RNA molecules present at one copy per 1 to 10 cells with a
success rate of more than 99.9%. Currently, CAGE is the only available means of analyzing genome-wide
gene expression for each promoter.
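The relationship between sequencing depth and the chance of capturing a rare transcript can be explored with a simple sampling model. The sketch below is not the calculation behind the figure quoted above. It assumes that each tag is an independent random draw from the mRNA pool, and the default of 300,000 mRNA molecules per cell is an illustrative assumption that does not come from the text; the result depends strongly on that value.

```python
import math


def detection_probability(
    tags_sequenced: int,
    copies_per_cell: float,
    mrna_per_cell: float = 300_000,  # assumed value, not from the text
) -> float:
    """Probability of seeing at least one tag from a transcript.

    Each tag is treated as an independent draw, so the chance of missing
    the transcript with every tag is (1 - fraction) ** tags_sequenced.
    """
    fraction = copies_per_cell / mrna_per_cell
    if not 0.0 <= fraction < 1.0:
        raise ValueError("copies_per_cell must be in [0, mrna_per_cell)")
    # log1p keeps precision when the fraction is very small.
    miss_all = math.exp(tags_sequenced * math.log1p(-fraction))
    return 1.0 - miss_all


if __name__ == "__main__":
    for tags in (1_000_000, 10_000_000):
        for copies in (1.0, 0.1):  # 1 copy per cell, 1 copy per 10 cells
            p = detection_probability(tags, copies)
            print(f"{tags:>10} tags, {copies} copies/cell: {p:.4f}")
```

Changing `mrna_per_cell` or the sequencing depth shows how sensitive the detection rate is to both parameters.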
At present, the limitation of CAGE is the amount of RNA required for detection. The next target in the
development of CAGE is to reduce the amount of sample required; ultimately, however, detection
technology that works with a single cell needs to be developed. This is because conditions vary from
cell to cell, and if more than one cell is measured in an analysis, only averaged results can be
obtained. We then cannot grasp the phenomena unique to each stage.