Chapter 5 part II

Bioinformatics, Genomics, and
Proteomics (Part II)
Proteomics
• The comprehensive study of all the proteins of a
cell, tissue, body fluid, or organism from a
variety of perspectives, including structure,
function, expression, profiling, and protein-protein interactions.
• Insight into the proteins that are present in a cell
or tissue under particular biological conditions
can aid in our understanding of the cell’s
activities.
• Genomic sequence alone has limitations.
Limitations of Genomics
• Some annotated open reading frames (ORFs)
are subsequently found not to encode proteins.
• Others encode proteins whose functions cannot
be predicted from the sequence.
• Post-translational modifications that influence
the protein function and cellular localization
often cannot be predicted from the sequence.
• mRNA levels do not always correlate with
protein levels, and interactions between proteins
cannot be assessed by genomics.
Proteomics
• On the other hand, a protein’s function can
sometimes be inferred by determining the
condition under which it is expressed and active.
• From a practical standpoint, proteomics can be
used to track clinical disorders and detect
targets for therapeutic treatments.
Proteomics - Complications
• In eukaryotes, there are many more proteins than
genes because of alternative splicing, post-translational
modifications, and post-transcriptional
modifications of RNA (RNA editing).
• It is impossible to account experimentally for
every member of a proteome with a single
technique because proteins are susceptible to
degradation; have different properties, including
solubilities; and range considerably in
abundance.
2D PAGE
• First dimension – isoelectric focusing is performed to first
separate the proteins in a mixture on the basis of their net
charge.
• The protein mixture is applied to a pH gradient gel. When
an electric current is applied, proteins will migrate toward
either the anode (+) or cathode (-), depending on their net
charge.
• As proteins move through the pH gradient, they will gain
or lose protons until they reach a point in the gel where
their net charge is zero.
• The pH at this position in the gel is the protein's
isoelectric point (pI).
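The focusing behaviour described above can be illustrated numerically. The sketch below is a minimal illustration, not a validated pI predictor: it estimates the net charge of a sequence from the Henderson-Hasselbalch relation and locates the pH of zero net charge by bisection. The pKa values are rounded approximations chosen for illustration (published scales differ), and the function names are hypothetical.

```python
# Approximate pKa values (assumed for illustration; published scales differ).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(sequence, ph):
    """Net charge of a protein sequence at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["Nterm"] = 1
    counts["Cterm"] = 1
    # Basic groups carry +1 when protonated.
    positive = sum(counts[g] / (1 + 10 ** (ph - pka))
                   for g, pka in PKA_POSITIVE.items())
    # Acidic groups carry -1 when deprotonated.
    negative = sum(counts[g] / (1 + 10 ** (pka - ph))
                   for g, pka in PKA_NEGATIVE.items())
    return positive - negative


def isoelectric_point(sequence, tol=1e-4):
    """Find the pH at which the net charge is zero by bisection.

    Net charge falls monotonically as pH rises, so the zero
    crossing is bracketed between pH 0 and pH 14.
    """
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged (or zero): pI lies at lower pH
    return (low + high) / 2


if __name__ == "__main__":
    for seq in ("DDDDEE", "ACDKEH", "KKKKRR"):  # made-up sequences
        print(seq, round(isoelectric_point(seq), 2))
```

An acidic sequence stops migrating at a low-pH position of the gradient and a basic one at a high-pH position, which is why two proteins with similar pI values end up in the same spot after the first dimension.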
2D PAGE
• Second dimension – separation by molecular weight.
• Several proteins in a sample may have the same
isoelectric point and therefore migrate to the same
position in the gel.
• Proteins are further separated on the basis of
differences in their molecular weights (MW) by
electrophoresis, at a right angle to the first dimension,
through a sodium dodecyl sulfate (SDS)
polyacrylamide gel. The gel is visualized with Coomassie
blue or silver protein stain.
• A 2D polyacrylamide gel can resolve up to 2,000
different proteins.
2D PAGE
• The pattern of stained spots is captured by
densitometric scanning.
• Databases have been established with images of 2D
PAGE from different cell types.
• Software packages are available for detecting spots,
matching patterns between gels, and quantifying the
protein content of the spots.
• The next task after separation is to excise the
individual proteins from the gel, and to identify as
many of the proteins as possible using mass
spectrometry (MS).
2D PAGE - Limitations
• Proteins with either low or high molecular
weights, those that are found in cellular
membranes, and those that are present in small
amounts are not readily resolved by 2D PAGE.
• Highly charged proteins, such as ribosomal
proteins and histone proteins, are not separated
under standard conditions.
MALDI - MS
• A spot is excised from the
gel and treated with trypsin.
• Purified tryptic peptides are
separated by MALDI-time
of flight (TOF) MS.
• The set of peptide masses
from the unknown protein
is used to search a
database, and the best
match is determined.
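The database search step can be sketched as peptide mass fingerprinting: digest every database sequence in silico with trypsin, compute the peptide masses, and count how many observed masses find a theoretical partner. The code below is a simplified illustration under stated assumptions: it uses neutral monoisotopic masses, ignores missed cleavages and modifications, and searches a made-up two-entry database (`TOY_DATABASE`, `protA`, and `protB` are hypothetical names and sequences).

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def trypsin_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol):
    """Number of observed masses explained by a sequence's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in trypsin_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, database, tol=0.2):
    """Name of the database entry that explains the most observed masses."""
    return max(database,
               key=lambda name: count_matches(observed, database[name], tol))


# Hypothetical sequences, for illustration only.
TOY_DATABASE = {
    "protA": "MKTAYIAKQRQISFVK",
    "protB": "GSHMLEDPRVVKAAR",
}

if __name__ == "__main__":
    # Simulate a spectrum by digesting protA itself.
    observed = [peptide_mass(p) for p in trypsin_digest(TOY_DATABASE["protA"])]
    print(best_match(observed, TOY_DATABASE))
```

Real search engines score matches statistically rather than by a raw count, so this shows only the matching principle.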
ESI – MS - MS
• A spot is excised from the gel
and treated with trypsin.
• Purified tryptic peptides are
separated according to their
mass/charge (m/z) ratios, and
the amino acid sequence of a
selected peptide is
determined by tandem MS.
• The unknown protein is
identified by searching a
protein database with the
amino acid sequences from
two or more peptides.
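The identification rule stated above (two or more sequenced peptides must match) can be sketched as a substring search. This is a minimal illustration, not how production search engines work; the function name, the database, and its sequences are hypothetical.

```python
def identify_protein(peptide_sequences, database, min_peptides=2):
    """Return database entries containing at least `min_peptides` of the
    sequenced peptides, best-supported entry first."""
    support = {}
    for name, protein_sequence in database.items():
        matched = sum(1 for pep in peptide_sequences if pep in protein_sequence)
        if matched >= min_peptides:
            support[name] = matched
    return sorted(support, key=support.get, reverse=True)


# Hypothetical sequences, for illustration only.
TOY_PROTEINS = {
    "protA": "MKTAYIAKQRQISFVK",
    "protB": "GSHMLEDPRVVKAAR",
}

if __name__ == "__main__":
    print(identify_protein(["TAYIAK", "QISFVK"], TOY_PROTEINS))
```

Requiring at least two independent peptides lowers the chance that a single short sequence matches an unrelated protein by coincidence.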
Protein Expression profiling
• The 2D differential in-gel electrophoresis (DIGE) method is used for
quantitative analysis of protein expression.
• The proteins of two proteomes are labeled with the
fluorescent dyes Cy3 and Cy5, respectively.
• The samples are combined and run on 2D PAGE.
• The gel is scanned for each fluorescent dye, and the
relative levels of two dyes in each protein spot are
recorded.
• The gel is stained with a protein dye, and an unknown spot
is excised and treated with trypsin.
• The peptides are separated by ESI-MS-MS, and the
amino acid sequences are determined.
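The comparison of the two dye signals in each spot can be expressed as a log2 ratio. The sketch below assumes the two scans have already been background-corrected and normalized to each other; the twofold cutoff, the spot identifiers, and the intensities are illustrative assumptions, not values from any experiment.

```python
import math


def classify_spots(spots, fold_cutoff=2.0):
    """Classify each spot by its Cy5/Cy3 intensity ratio.

    spots: dict mapping spot id -> (cy3_intensity, cy5_intensity).
    Returns dict mapping spot id -> (log2 ratio, label).
    """
    threshold = math.log2(fold_cutoff)
    result = {}
    for spot_id, (cy3, cy5) in spots.items():
        log_ratio = math.log2(cy5 / cy3)
        if log_ratio >= threshold:
            label = "higher in Cy5 sample"
        elif log_ratio <= -threshold:
            label = "higher in Cy3 sample"
        else:
            label = "unchanged"
        result[spot_id] = (log_ratio, label)
    return result


if __name__ == "__main__":
    # Made-up intensities for three spots.
    example = {"spot1": (100.0, 400.0),
               "spot2": (400.0, 100.0),
               "spot3": (100.0, 110.0)}
    for spot_id, (ratio, label) in classify_spots(example).items():
        print(spot_id, round(ratio, 2), label)
```

Working in log space makes a fourfold increase and a fourfold decrease symmetric around zero, which simplifies choosing the spots to excise for MS.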
ICAT- LC - MS - MS
• Proteins from two proteomes
are labeled with light and
heavy ICAT reagent.
• The samples are combined
and treated with trypsin.
• The peptides are captured by
affinity chromatography using
avidin, and fractionated by LC.
• The ratio of light:heavy is
determined by MS.
• Amino acid sequences are
determined by ESI-MS-MS.
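The light:heavy ratio can be sketched as a search for pairs of peaks separated by the mass difference between the two ICAT reagents. The code below is an illustration under stated assumptions: peaks are given as neutral masses with one label per peptide, and the default shift of about 8.05 Da assumes a heavy reagent carrying eight deuterium atoms. Other reagent versions differ, so treat `mass_shift` as a parameter to set for the reagent actually used. The peak list is made up.

```python
def icat_ratios(peaks, mass_shift=8.05, tol=0.05):
    """Pair light and heavy peaks and report light:heavy intensity ratios.

    peaks: list of (mass, intensity) tuples.
    Returns a list of (light_mass, heavy_mass, light_to_heavy_ratio).
    """
    pairs = []
    for light_mass, light_intensity in peaks:
        for heavy_mass, heavy_intensity in peaks:
            # A heavy partner sits one reagent mass shift above the light peak.
            if abs((heavy_mass - light_mass) - mass_shift) <= tol:
                pairs.append((light_mass, heavy_mass,
                              light_intensity / heavy_intensity))
    return pairs


if __name__ == "__main__":
    # Made-up peak list: one light/heavy pair plus an unpaired peak.
    example_peaks = [(1000.50, 300.0), (1008.55, 100.0), (1500.70, 50.0)]
    for light, heavy, ratio in icat_ratios(example_peaks):
        print(light, heavy, ratio)
```

A ratio above 1 indicates that the peptide, and by inference its parent protein, was more abundant in the sample labeled with the light reagent.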
Protein Microarray
• Conceptually, protein microarrays are similar to DNA
microarrays.
• They consist of large numbers of proteins individually
immobilized in known positions on the coated surface
of a glass slide or silicon chip.
• The proteins arrayed can be antibodies specific for
each protein in an organism, purified recombinant
proteins, or short synthetic peptides.
• There are many ways of attaching a protein to a
support surface.
• The major objective of any coupling system is
maintenance of protein structure and function.
Protein Microarray
• Some systems bind proteins to a chemical group that
coats the surface of the support.
• With other protocols, recombinant proteins are
prepared with a short amino acid sequence (tag) at the N
or C terminus that binds to a recognition sequence on
the support. In this case, all the protein molecules
are uniformly oriented.
• Instead of spotting proteins on a flat surface, some
microarrays are engineered with tiny depressions
(nanowells) that keep each protein moist and prevent
mixing with adjacent proteins.
Protein Microarray
• The purpose of protein microarray analyses is to
detect, on a large scale, the molecules that a protein
interacts with.
• These interacting molecules can be other proteins,
nucleic acid sequences, or low molecular-weight
compounds.
• Protein populations from different samples can be
compared, for example, in control versus treated
samples or in normal versus diseased tissues.
Protein Microarray - Visualizing
• Direct labeling – the test samples are labeled directly with
a fluorescent dye, and the labeled
molecules that bind to the proteins of a microarray
are detected with a laser scanner. A two-dye strategy (e.g., Cy3 and
Cy5) can be used to compare proteins in two different
samples on a single array.
• Sandwich style – the sample molecules are
biotinylated, and after the initial incubation, a
streptavidin-fluorescent-dye conjugate that binds to
biotin is applied to facilitate detection of the sample
molecules.
Protein array detection method
Analytical VS Functional
• Analytical protein microarray. Different types of ligand,
including antibodies, antigens, DNA or RNA aptamers,
carbohydrates or small molecules, with high affinity and
specificity, are spotted down onto a surface.
• These chips can be used for monitoring protein
expression levels, protein profiling, and clinical diagnostics.
• Similar to the procedure in DNA microarray experiments,
protein samples from two biological states to be
compared are separately labeled with red or green
fluorescent dyes, mixed, and incubated with the chips.
• Spots in red or green color identify an excess of proteins
from one state over the other.
Analytical VS Functional
• Functional protein microarray. Native proteins or peptides
are individually purified or synthesized using high-throughput
approaches and arrayed onto a suitable
surface to form the functional protein microarrays.
• These chips are used to analyze protein activities, binding
properties and post-translational modifications.
• With the proper detection method, functional protein
microarrays can be used to identify the substrates of
enzymes of interest.
• Consequently, this class of chips is particularly useful in
drug and drug-target identification and in building
biological networks.
http://www.nature.com/nature/journal/v422/n6928/images/nature01512-f1.2.jpg
Analytical microarray
• Analytical microarrays are used for protein profiling, that
is, detection and quantification of proteins present in a
sample.
• An analytical microarray can be an antibody microarray or an antigen microarray.
• Antibody microarrays are often probed with proteins from
biological sources, such as plasma or serum, or proteins
that are secreted from cells in culture to determine
disease-specific profiles.
• For example, antibody microarrays that specifically detect
cytokines have been formulated.
Analytical microarray
• Cytokine antibody microarrays are used to examine
cytokines in both normal and diseased states, and from a
variety of sources after various treatments.
• A sandwich immunoassay is often used to detect
cytokines that bind to immobilized antibodies.
• After the microarray is treated, biotinylated cytokine
antibodies are added and bind to the corresponding
captured cytokine.
• For visualization, a streptavidin-fluorescent-dye conjugate
attaches to the biotin of the secondary antibody.
• The signals are detected with a laser scanner.
Cytokine antibody microarray
Analytical microarray
• Plasma samples from individuals with Alzheimer disease
and those from individuals with no dementia were applied
to a microarray made up of antibodies against 120
cytokines.
• Eighteen cytokines were found to be associated with
Alzheimer disease.
• The levels of 7 of these were higher and 11 were lower in
individuals with Alzheimer disease than in the subjects
without dementia.
• Possibly, the Alzheimer disease-specific cytokine
signature may provide the basis for a diagnostic test.
Analytical microarray – Antigen
• Another type of analytical microarray is the protein (antigen)
microarray. Proteins are attached to a solid support and
then probed with antibodies, mostly in serum samples.
• The purpose of these studies is to discover whether the
production of antibodies against specific proteins
correlates with particular diseases or biological processes.
• A microarray of 5,000 different human proteins was
created and used to determine if serum from ovarian
cancer patients has a distinctive set of antibodies in
comparison to the antibody population of healthy
individuals.
Analytical microarray – Antigen
• The initial results revealed 94 proteins that were
specifically recognized by antibodies in the sera from
the ovarian cancer patients.
• With further testing, three proteins were consistently
found to be specific for ovarian cancer.
• The ovarian-cancer-specific proteins may help in the
early detection of the disease.
• The earlier ovarian cancer is diagnosed, the better
the chance of survival.
Analytical microarray
• Analytical antibody microarrays are also used to detect
whether post-translational modifications, such as
phosphorylation of tyrosine or glycosylation, are
associated with specific diseases.
• Proteins are first captured by primary antibodies
immobilized on a microarray.
• Then, the microarray is flooded with biotinylated anti-phosphotyrosine antibody.
• Next, streptavidin conjugated with a fluorescent dye is
added, and the protein spots that fluoresce are
detected.
• Detection of glycan groups is performed in a similar manner.
Analytical microarray – Reverse phase
• A multiprotein sample, for example, from a cell lysate or
tissue specimen, is immobilized in a single spot on a
support.
• Several such multiprotein samples are spotted on the
microarray.
• Then, the microarray is probed with a single target
molecule.
• The advantage is that a large number of samples can be
compared at one time.
• With a reverse-phase microarray, the presence of specific
proteins in multiple complex samples can be readily
determined.
Reverse-phase microarray format
Functional protein microarray
• Functional protein microarrays feature large sets of
individual proteins that are used predominately to
determine interactions with other proteins or low
molecular-weight compounds, such as lipids, drugs, and
metabolites.
• Ideally, the functional protein array should consist of all
possible proteins of a proteome under study.
• To obtain comprehensive representation of a proteome, a
library containing all of the protein coding sequences is
first constructed.
• A library of cloned protein-encoding ORFs has been
dubbed an ORFeome.
Functional protein microarray
• The starting point for producing an ORFeome is
usually PCR amplification of the coding
sequences for cloning into a vector.
• For prokaryotic organisms, the protein-coding
sequences can often be readily identified from
genomic sequences.
• On the other hand, full-length cDNA libraries are
the primary sources of the coding sequences of a
eukaryotic proteome.
Integration and excision of
bacteriophage λ into and
from the E. coli genome via
recombination between
attachment (att) sites in the
bacteria and bacteriophage
DNA.
Primer pair used to amplify ORFs for
recombinational cloning to generate an
ORFeome.
Recombinational cloning
• A primer pair is used to amplify ORFs, resulting in a PCR-amplified
ORF with attachment sites (attB1 and
attB2).
• Recombination between PCR-amplified ORF and a
donor vector with attP1 and attP2 sites on either side
of the ccdB gene results in an entry clone in which
ORF is flanked by attL1 and attL2 sites.
• The selectable marker (SM1) selects transformed
cells with an entry clone.
• The protein encoded by ccdB is toxic to transformed
cells with a non-recombined donor vector molecule.
Recombinational cloning
• The next step is the expression of each cloned ORF.
• Recombination between the entry clone with attL1
and attL2 sites and a destination vector with attR1
and attR2 results in an expression clone with attB1
and attB2 sites flanking the ORF.
• The selectable marker (SM2) selects transformed
cells with an expression clone.
• Cells with an intact destination vector that did not
undergo recombination are killed by CcdB protein.
• For construction of a microarray, each protein
encoded by an ORF is isolated by affinity purification.
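The bookkeeping of att sites across the two recombination steps can be traced with a small model. The sketch below encodes only what the slides state (attB with attP gives attL, attL with attR gives attB) plus the complementary by-product, and it applies the two selections (selectable marker and ccdB killing). The class and function names are hypothetical, and the model ignores all molecular detail.

```python
from dataclasses import dataclass

# Site conversions: B x P -> L (by-product carries R); L x R -> B (by-product carries P).
SITE_PRODUCTS = {("attB", "attP"): ("attL", "attR"),
                 ("attL", "attR"): ("attB", "attP")}


@dataclass(frozen=True)
class Construct:
    left_site: str    # e.g. "attB1"
    insert: str       # "ORF" or "ccdB"
    right_site: str   # e.g. "attB2"
    backbone: str     # "PCR product", "donor", or "destination"


def recombine(insert_source, vector):
    """Exchange the inserts of two constructs by site-specific recombination.

    Returns (clone, by_product): the clone carries the source's insert in
    the vector backbone; the by-product carries the vector's former insert.
    """
    kinds = (insert_source.left_site[:-1], vector.left_site[:-1])
    if kinds not in SITE_PRODUCTS:
        raise ValueError(f"no recombination between {kinds[0]} and {kinds[1]}")
    clone_kind, by_product_kind = SITE_PRODUCTS[kinds]
    clone = Construct(clone_kind + "1", insert_source.insert,
                      clone_kind + "2", vector.backbone)
    by_product = Construct(by_product_kind + "1", vector.insert,
                           by_product_kind + "2", insert_source.backbone)
    return clone, by_product


def cell_survives(construct, selected_backbone):
    """A cell survives selection if it carries the selected backbone's marker
    and its construct does not encode the toxic CcdB protein."""
    return construct.backbone == selected_backbone and construct.insert != "ccdB"


pcr_orf = Construct("attB1", "ORF", "attB2", "PCR product")
donor = Construct("attP1", "ccdB", "attP2", "donor")          # carries SM1
destination = Construct("attR1", "ccdB", "attR2", "destination")  # carries SM2

entry_clone, _ = recombine(pcr_orf, donor)
expression_clone, _ = recombine(entry_clone, destination)

if __name__ == "__main__":
    print(entry_clone)
    print(expression_clone)
    print(cell_survives(donor, "donor"), cell_survives(entry_clone, "donor"))
```

The model makes the selection logic explicit: an unrecombined donor or destination vector still carries ccdB, so only cells with a recombined clone survive.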
Protein – protein interaction mapping
• Proteins seldom act alone. On average, one protein
interacts with five others.
• The two-hybrid method is used to determine pairwise
protein-protein interactions.
• The underlying principle of this assay is that the
physical connection between two proteins reconstitutes
an active transcription factor that initiates the
expression of a reporter gene.
• Generally, transcription factors have two domains, a DNA
binding domain and an activation domain.
• These two domains need not be part of the same
protein to function.
Yeast Two-Hybrid System
• The availability of complete genome sequences makes
it possible to use the yeast two-hybrid system to screen
for all possible interactions between the proteins in an
organism rather than to test one bait at a time.
• The ORFs from an organism’s genome are cloned into
two plasmid vectors, one that expresses the bait
(target) and another that produces the prey (the interacting
proteins to be identified).
• Each is introduced into yeast cells by transformation.
• A high-throughput mating method is then used to
introduce each bait plasmid into yeast cells with each
prey plasmid, and the hybrids are screened for
expression of the reporter gene.
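The logic of an all-against-all two-hybrid screen can be sketched as a loop over every bait and prey combination, with the reporter switched on only when the pair physically interacts. In the sketch below the set of true interactions is a made-up input standing in for biology; the function names and ORF labels are hypothetical, and false positives and negatives are ignored.

```python
from collections import defaultdict
from itertools import product


def reporter_expressed(bait, prey, interacting_pairs):
    """The reporter is transcribed only if the bait (fused to the DNA binding
    domain) and the prey (fused to the activation domain) associate."""
    return frozenset((bait, prey)) in interacting_pairs


def matrix_screen(orfs, interacting_pairs):
    """Mate every bait strain with every prey strain and keep reporter-positive pairs."""
    return [(bait, prey)
            for bait, prey in product(orfs, repeat=2)
            if reporter_expressed(bait, prey, interacting_pairs)]


def interaction_map(hits):
    """Collapse bait/prey hits into an undirected map: protein -> set of partners."""
    partners = defaultdict(set)
    for bait, prey in hits:
        partners[bait].add(prey)
        partners[prey].add(bait)
    return dict(partners)


if __name__ == "__main__":
    orfs = ["orfA", "orfB", "orfC"]                      # hypothetical ORFs
    true_interactions = {frozenset(("orfA", "orfB"))}    # made-up ground truth
    hits = matrix_screen(orfs, true_interactions)
    print(hits)
    print(interaction_map(hits))
```

Because each interacting pair is tested in both orientations (A as bait with B as prey, and the reverse), a real screen can use agreement between the two orientations as a consistency check.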
http://www.sumanasinc.com/webcontent/animations/content/yeasttwohybrid.html
Complementation assay
for detecting pairwise
protein interactions in
mammalian cells.
Large-scale screens for
protein interactions using
the yeast two-hybrid
system. Two libraries are
prepared, one containing
genomic DNA fragments or
cDNAs fused to the DNA
coding sequence for the
DNA binding domain (bait
library) and the other fused
to the activation domain (prey
library).
Protein interaction map of calcium signaling protein
clusters of D. melanogaster.
Protein Arrays
• The yeast two-hybrid system is powerful but it has
some shortcomings.
• The assay is based on transcription, so the bait and
the prey proteins must enter the nucleus and
interact in a cellular location very different from their
normal environment.
• Protein microarrays can also be used to screen for protein-protein interactions.
Protein Arrays
• All the ORFs from the yeast genome are used to
express each yeast protein tagged with a glutathione
S-transferase (GST) epitope.
• GST-tagged proteins are purified and spotted onto
glass slides to generate protein microarrays.
• The protein under investigation is labeled and added to
the array under gentle conditions that allow the proteins
to interact.
• The spots on microarrays are then analyzed for the
intensity of the signal from the labeled interacting test
protein.
TAP tag procedure for protein interactions
• Two DNA sequences (tag 1 and tag 2), each encoding a
short amino acid sequence with high affinity for a
specific molecule, are cloned together and fused in
frame to the 3’ end of a cDNA (cDNA X).
• The tagged cDNA construct is introduced into a host
cell, where it is transcribed and translated.
• Other cellular proteins bind to the protein encoded by
cDNA X. The complex of interacting proteins is
isolated by the binding of tag 1 to its affinity partner.
• The cluster is eluted from the affinity partner by
cleaving off tag 1.
TAP tag procedure for protein interactions
• A second purification step is carried out with tag 2 and
its affinity partner.
• The proteins of the cluster are separated by one-dimensional PAGE.
• Single bands are excised from the gel and treated with
trypsin.
• Peptide amino acid sequences are obtained with ESI-MS-MS and searched against a protein database.