BB30055: genes & genomes
MV Hejmadi 2004-05
Post-genomics
Limitations of the HGP: Genomics alone cannot predict which proteins the genes encode, what those proteins
do, or how they interact at the cellular level. The post-genomic challenge therefore includes transcriptome and
proteome analysis: the identification and quantitation of proteins, their cellular localisation, and their
modifications, interactions, activities and, finally, function.
The information from any genome sequencing database facilitates studies on the following:
(A) Identifying genes from the sequence: 2 general approaches to gene hunting
Ab initio method (computational): This method uses bioinformatics to scan DNA sequences for
special features associated with genes, including exons and other sequence signals such as splice
sites. Software packages for automated gene annotation, such as GENSCAN and GENEBUILDER,
employ a range of strategies including
• Scanning ORFs (open reading frames) – initiation or termination codons
  o Codon bias found in specific species
  o Exon-intron boundaries
  o Upstream control sequences – e.g. conserved motifs in transcription factor binding regions
  o CpG islands
• Homology searches
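Two of the signals listed above, ORF scanning and CpG content, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how GENSCAN or GENEBUILDER work internally: it scans the forward strand only, treats ATG as the sole initiation codon, ignores introns, and the names `find_orfs`, `cpg_observed_expected` and the `min_codons` cut-off are hypothetical.

```python
# Minimal sketch of two ab initio signals: ORF scanning and CpG content.
# Assumptions (not from the handout): forward strand only, ATG as the only
# start codon, no introns, and a hypothetical minimum ORF length.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs for ATG...stop ORFs in the three
    forward reading frames. Positions are 0-based, end is exclusive and
    includes the stop codon. min_codons counts start and stop codons too."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first initiation codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)
```

For a real eukaryotic gene the ORF scan would have to be combined with the other signals in the list (codon bias, exon-intron boundaries), because introns interrupt the reading frame.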
Experimental method: Experimental evaluation uses transcribed RNA to locate exons and
entire genes within a DNA fragment. Approaches include
a) Hybridisation approaches – Northern Blots, cDNA capture / cDNA select, Zoo blots
b) Transcript mapping: RT-PCR, RACE etc.
(B) Gene expression profiling (determining gene function): Uses either or both
COMPUTATIONAL APPROACH: Homology searches for either orthologous genes (homologues in
different organisms that share a common ancestor) or paralogous genes (homologues within the same
organism, e.g. multigene families)
EXPERIMENTAL APPROACH: Includes functional analysis of known genes using methods such as
a) gene inactivation (knockouts, RNAi, site-directed mutagenesis, transposon tagging, genetic
footprinting etc)
b) gene overexpression (transgenics, reporter genes, knock-ins, etc)
(C) Genome activity studies: Functional gene expression on its own is not enough. It needs to be complemented
by transcriptome and proteome analyses in order to understand how the cell operates.
The transcriptome (global mRNA profiling)
The transcriptome can be defined as the complete collection of transcribed elements of the genome. In
addition to mRNAs, it also represents non-coding RNAs, which are used for structural and regulatory
purposes. Alterations in the structure or levels of expression of any one of these RNAs or their proteins
can contribute to disease. An understanding of the transcriptome provides clues to
• Regions of transcription
• Transcription factor binding sites
• Sites of chromatin modification
• Sites of DNA methylation
• Chromosomal origins of replication
(Transcriptome maps for chromosomes 21 and 22 have been published: Science (2002) May 3; 296: 916-919.)
Transcriptome studies can be done by either
• SAGE (serial analysis of gene expression), or
• Microarrays (the human transcriptome map is available at http://bioinfo.amc.uva.nl/HTM/)
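The counting idea behind SAGE can be illustrated with a short Python sketch: each transcript is represented by a short tag taken from a defined position, and the number of times a tag is seen estimates the abundance of that transcript. The sketch assumes the commonly used anchoring enzyme NlaIII (recognition site CATG) and a 10-base tag taken immediately 3' of the last site in each cDNA; the function names are hypothetical, and real SAGE libraries read tags from sequenced concatemers rather than from full cDNA sequences.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII recognition site (assumed anchoring enzyme)
TAG_LENGTH = 10       # assumed tag length in bases


def sage_tag(cdna):
    """Return the tag immediately 3' of the last CATG site in a cDNA,
    or None if there is no site or too little sequence after it."""
    cdna = cdna.upper()
    pos = cdna.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    tag_start = pos + len(ANCHOR_SITE)
    tag = cdna[tag_start:tag_start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def tag_counts(cdnas):
    """Count how often each tag occurs across a collection of cDNA sequences."""
    tags = (sage_tag(c) for c in cdnas)
    return Counter(t for t in tags if t is not None)
```

Calling `tag_counts` on a list of cDNA sequences gives a tag-to-count table; mapping tags back to genes would need a reference transcript database, which is outside this sketch.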
The proteome
Proteome projects worldwide are co-ordinated by HUPO (the Human Proteome Organisation) and involve
protein biochemistry on an unprecedented, high-throughput scale. However, the problems associated with
proteomics include limited and variable sample material, sample degradation, widely varying protein
abundance, post-translational modifications, huge tissue, developmental and temporal specificity as well
as disease and drug influences.
The main areas of proteomics research are
1) Mass spectrometry-based proteomics: This approach involves protein separation by 2-D gel
electrophoresis followed by MS of the protein spots, and is based on de novo analysis of proteins from
cells and tissues. MS-based proteomics was made possible by the discovery of protein ionisation
techniques. It can be used for protein identification and quantification, profiling, and the study of
protein interactions and modifications.
Principle of MS: Any mass spectrometer consists of an ion source, a mass analyser that measures the
mass-to-charge ratio (m/z) of the ionised analytes, and a detector that registers the number of ions at
each m/z value. Electrospray ionisation (ESI) and matrix-assisted laser desorption/ionisation (MALDI)
are the 2 techniques commonly used to volatilise/ionise the proteins/peptides for MS analysis. MALDI-MS
is used for simple peptide mixtures whereas ESI-MS is used for complex samples.
2) Array-based proteomics: Based on the cloning and amplification of identified ORFs into homologous
(ideally used for bacterial and yeast proteins) or sometimes heterologous systems (e.g. insect cells, which
give post-translational modifications similar to those of mammalian cells). A fusion tag (a short peptide
or protein domain, e.g. GST, that is linked to each protein member) is incorporated into the plasmid
construct. These constructs can then be used to analyse
a. Protein expression and purification:
b. Protein activity: Analysis can be done using either biochemical genomics or functional protein
microarrays.
c. Protein interaction: analysis can be done using methods such as two-hybrid analysis (yeast
2-hybrid), FRET (fluorescence resonance energy transfer), phage display etc.
d. Protein localisation: allows protein function in complex cellular networks to be understood by
immunolocalisation of epitope-tagged products, e.g. the use of GFP or luciferase tags.
3) Structural proteomics and imaging techniques
4) Informatics
5) Clinical proteomics
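As a worked illustration of the m/z principle described under mass spectrometry-based proteomics above, the value a mass analyser reports for a protonated peptide can be computed from its neutral mass and charge. This is a minimal sketch assuming positive-ion mode, where each unit of charge comes from one added proton; the function names are hypothetical. MALDI typically produces singly charged ions, whereas ESI typically produces multiply charged ions, so the same peptide can appear at several m/z values.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons (Da)


def mz(mass, charge):
    """m/z of an analyte with the given neutral monoisotopic mass (Da)
    carrying `charge` added protons (positive-ion mode assumed)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON_MASS) / charge


def neutral_mass(observed_mz, charge):
    """Invert mz(): recover the neutral mass from an observed peak."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (observed_mz - PROTON_MASS)
```

For example, a peptide of neutral mass 1000 Da would be expected near m/z 1001.007 when singly charged and near m/z 501.007 when doubly charged.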
Comparative genomics
Comparing the genomes of various organisms (e.g. mouse) can help in studying human disease genes or in
mapping other genes.
References for post-genomics (A) and (B)
1) Chapter 7 from Genomes 2 by TA Brown OR Chapter 19 from Human Molecular Genetics 3 by Strachan & Read
2) Science (2001) Vol. 291, No. 5507, pp. 1257-60
References for (C) proteomics:
1) Nature (13 March 2003). Proteomics insight articles from Vol. 422, No. 6928, pp. 191-197.
2) Genomes 2 by TA Brown, pp. 208-213
Optional Reading
1) Boheler KR and Stern MD. The new role of SAGE in gene discovery. Trends in Biotechnology (Feb 2003)
Vol. 21(2), pp. 55-57.