bchm628_lect6_15

Protein-protein Interactions
June 18, 2015
Why PPI?
 Protein-protein interactions determine
outcome of most cellular processes
 Proteins which are close homologues often
interact in the same way
 Protein-protein interactions place evolutionary
constraints on protein sequence and structural
divergence
 Precursor to networks
PPI classification
 Strength of interaction
 Permanent or transient
 Specificity
 Location within polypeptide chain
 Similarity of partners
 Homo- or hetero-oligomers
 Direct (binary) or a complex
 Confidence score
Determining PPIs
 Small-scale methods
 Co-immunoprecipitation
 Affinity chromatography
 Pull-down assays
 In vitro binding assays (FRET, Biacore, AFM)
 Structural (co-crystals)
PPIs by high-throughput methods
 Yeast two hybrid systems
 Affinity tag purification followed by mass
spectrometry
 Protein microarrays
 Microarrays/gene co-expression
 Implied functional PPIs
 Synthetic lethality
 Genetic interactions, implied functional PPIs
Yeast two hybrid system
Gal4 protein comprises DNA-binding and activating domains
Binding domain interacts with promoter
Activating domain interacts with polymerase
Measure reporter enzyme activity (e.g. blue colonies)
Yeast two hybrid system
•Gal4 protein: the two domains do not need to be expressed as a single protein
•If they come into close enough proximity to interact, they will activate the RNA polymerase
Two other protein domains (A & B) interact
Binding domain (fused to A) interacts with promoter
Activating domain (fused to B) interacts with polymerase
Measure reporter enzyme activity (e.g. blue colonies)
Yeast two hybrid system
 This is achieved using gene fusion
 Plasmids carrying different constructs can be expressed in
yeast
Plasmid 1: binding domain as a translational fusion with the gene encoding another protein (A).
Plasmid 2: activating domain as a translational fusion with the gene encoding a different protein (B).
If the two proteins interact, Gal4 activity is reconstituted, the reporter gene is expressed and blue colonies form
Yeast two hybrid
 Advantages
 Fairly simple, rapid and inexpensive
 Requires no protein purification
 No previous knowledge of proteins needed
 Scalable to high-throughput
 Is not limited to yeast proteins
 Limitations
 Works best with cytosolic proteins
 Tendency to produce false positives
Mass spectrometry
 Need to purify protein or protein complexes
 Use an affinity-tag system
 Need efficient method of recovering fusion protein
in low concentration
TAP (tandem affinity purification)
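[Figure: the TAP tag is supplied as a PCR product (spacer, CBP, TEV site, Protein A) and integrated into the chromosome by homologous recombination. The resulting fusion protein is: Protein, spacer, CBP, TEV site, Protein A. CBP = calmodulin binding peptide.]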
TAP process
"Taptag simple" by Chandres - Own work.
Licensed under CC BY-SA 3.0 via Wikimedia Commons
TAP
 Advantages
 No prior knowledge of complex
composition
 Two-step purification increases specificity of pull-down
 Limitations
 Transient interactions may not survive 2 rounds of
washing
 Tag may prevent interactions
 Tag may affect expression levels
 Works less efficiently in mammalian cells
Other tags
 HA, Flag and His
 Anti-tag antibodies can interfere with MS analysis
 Streptavidin binding peptide (SBP)
 High affinity for streptavidin beads
 10-fold increase in efficiency of purification
compared to conventional TAP tag
 Successfully used to identify components of complexes in the Wnt/β-catenin pathway
Used Dsh-2 and Dsh-3 as bait proteins
The KLHL12-Cullin-3 ubiquitin ligase negatively regulates the Wnt-β-catenin pathway by targeting Dishevelled for degradation
Nature Cell Biology 8:348-357 (2006)
Binding partners of Bruton’s tyrosine kinase
Role in lymphocyte development &
B-cell maturation
Protein Science 20:140-149 (2011)
Databases of protein-protein interactions
 MINT – Molecular Interaction Database
 >240,000 interactions with 35,000 proteins
 Covers multiple species
 DIP -- Database of Interacting Proteins (UCLA)
 >79,000 interactions with >27,000 proteins
 CCSB – Proteomics-based interactomes (Harvard)
 Human, viruses, C. elegans, S. cerevisiae
 Some unpublished data
 IntAct – EBI molecular interaction database
 Curated data from multiple sources
Integrated Databases of PPIs
 MiMI: Michigan Molecular Interactions
 Data merged from several PPI databases; source
provenance maintained
 Links to literature sources for the PPI
 Linked to Entrez Gene, InterPro, Gene ontology
 Includes pathway data
 Various methods of viewing the data
 NOT CURATED
 Data only as good as source data
http://mimi.ncibi.org
MiMI database
MiMI search results
MiMI Gene Detail: Gene Ontology, Interactions, Pathways
KEGG pathway
Each protein name is a link to another page
Arrows & lines provide information about the type of interaction
Other viewing options
MeSH terms that involve this gene
PPI with this gene in Cytoscape
Adaptive PubMed search
 On average, two databases curating the same
publication agree on 42% of their interactions.
Discrepancies between sets of proteins annotated
from the same publication are less pronounced,
with an average agreement of 62%, but the overall
trend is similar
 Better agreement on non-vertebrate model
organisms data sets than for vertebrates
 Isoform complexity is a major issue
Literature curation of protein interactions: measuring agreement across
databases. Turinsky A.L. et al. Database, Vol. 2010, Article ID baq026
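One simple way to put a number on "agreement" between two databases curating the same publication is the shared fraction of their annotations (the Jaccard index). This is only an illustrative sketch: the slide does not state which measure the authors used, and the function names and protein identifiers below are made up.

```python
def normalize(interaction):
    """Treat A-B and B-A as the same undirected interaction."""
    return frozenset(interaction)


def agreement(items1, items2):
    """Jaccard index: size of the intersection divided by size of the union."""
    set1, set2 = set(items1), set(items2)
    union = set1 | set2
    if not union:
        return 1.0  # two empty annotations agree trivially
    return len(set1 & set2) / len(union)


def interaction_agreement(pairs1, pairs2):
    """Agreement on the interactions annotated from one publication."""
    return agreement(map(normalize, pairs1), map(normalize, pairs2))


def protein_agreement(pairs1, pairs2):
    """Agreement on the sets of proteins annotated from one publication."""
    proteins1 = {p for pair in pairs1 for p in pair}
    proteins2 = {p for pair in pairs2 for p in pair}
    return agreement(proteins1, proteins2)


# Hypothetical annotations of the same paper by two databases
db1 = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
db2 = [("P2", "P1"), ("P3", "P4"), ("P4", "P5")]
print(interaction_agreement(db1, db2))  # agreement on interactions
print(protein_agreement(db1, db2))      # agreement on proteins
```

In this toy case the two databases agree more on the proteins than on the interactions, the same direction as the trend reported above.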
iRefWeb
 Web interface to integrated database of protein-
protein interactions
 Better review of the records after pulling in the
data from the various source databases
 Can search by gene name or various IDs, including
batch searches.
 Does not have the pathway and other information,
but has a better measure of confidence of PPI
http://wodaklab.org/iRefWeb/
iRefWeb search
The search automatically tries to match both name and species.
MI score: the MINT-inspired (MI) score is a measure of confidence
in a molecular interaction. For an interaction between A and B it considers:
1. Total number of unique PubMed publications that support the
interactions
2. Cumulative sum of weighted evidence from all
3. The cumulative sum of weighted evidence from all interologs, i.e.
interactions containing homologous pairs A' and B'.
Interaction detail
STRING database
 Search Tool for the Retrieval of Interacting Genes
 Integrates information from existing PPI data
sources
 Provides confidence scoring of the interactions
 Periodically runs interaction prediction algorithms
on newly sequenced genomes
 v.10 covers >2000 organisms
http://string-db.org/
Networks in STRING database
Starting protein
Networks can be expanded
3 indirect interactions
Information about the proteins
Transferring PPI annotation
 Most of the high-throughput PPI work is done in
model organisms
 Can you transfer that annotation to a homologous
gene in a different organism?
Defining homologs
An orthologue of a protein is usually defined as the best-matching homologue in another species
 Candidates with a significant BLASTP E-value (<10^-20)
 Having ≥80% of residues in both sequences
included in the BLASTP alignment
 Having one candidate as the best-matching
homologue of the other candidate in the
corresponding organism
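The three criteria above amount to a filtered reciprocal-best-hit search. The sketch below is a minimal illustration, assuming the BLASTP results have already been parsed into simple records. The `Hit` class, its field names and the default cutoffs (E-value read as 10^-20, 80% coverage) are assumptions for illustration, not part of any BLAST library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One parsed BLASTP hit (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    query_coverage: float    # fraction of query residues in the alignment
    subject_coverage: float  # fraction of subject residues in the alignment


def passes_filters(hit, max_evalue=1e-20, min_coverage=0.80):
    """Significant E-value and enough of both sequences aligned."""
    return (hit.evalue < max_evalue
            and hit.query_coverage >= min_coverage
            and hit.subject_coverage >= min_coverage)


def best_hits(hits):
    """Map each query to its lowest-E-value subject among filtered hits."""
    best = {}
    for hit in hits:
        if not passes_filters(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep pairs where each protein is the other's best match."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if backward.get(b) == a}


# Hypothetical hits: organism A searched against B, and B against A
a_vs_b = [
    Hit("a1", "b1", 1e-50, 0.95, 0.90),
    Hit("a1", "b2", 1e-30, 0.90, 0.90),
    Hit("a2", "b2", 1e-25, 0.85, 0.50),  # fails the coverage filter
]
b_vs_a = [
    Hit("b1", "a1", 1e-48, 0.90, 0.95),
    Hit("b2", "a1", 1e-30, 0.90, 0.90),
]
orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(orthologs)
```

Here only a1 and b1 are each other's best match, so they are the only pair reported as candidate orthologues.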
Interologs
 If two proteins, A and B, interact in one organism and their
orthologs, A’ and B’, interact in another species, then the
pair of interactions A—B and A’—B’ are called interologs
 Align the homologs (A & A’, B & B’) to each other.
 Determine the percent identity and the E-value of both
alignments
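 Then calculate the joint identity and the joint E-value (the geometric means of the two identities and of the two E-values)
Joint identity
$J_I = \sqrt{I_{AA'} \times I_{BB'}}$
Joint E-value
$J_E = \sqrt{E_{AA'} \times E_{BB'}}$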
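A small worked sketch of the two joint quantities, assuming they are the geometric means of the individual values as defined by Yu et al. (2004). The function names are illustrative. The E-value version works in log space because multiplying two very small E-values directly can underflow to zero.

```python
import math


def joint_identity(identity_aa, identity_bb):
    """Geometric mean of the percent identities of A vs A' and B vs B'."""
    return math.sqrt(identity_aa * identity_bb)


def joint_evalue(evalue_aa, evalue_bb):
    """Geometric mean of the two E-values, computed in log space."""
    if evalue_aa <= 0 or evalue_bb <= 0:
        # BLAST reports 0.0 for extremely small E-values
        return 0.0
    mean_log = (math.log10(evalue_aa) + math.log10(evalue_bb)) / 2
    return 10 ** mean_log


# Hypothetical alignments: A vs A' is 90% identical, B vs B' is 72% identical
print(joint_identity(90.0, 72.0))
print(joint_evalue(1e-60, 1e-80))
```

With these made-up numbers the joint identity is just above 80% and the joint E-value is about 10^-70.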
Transfer of annotation
 Compared interaction datasets between yeast,
worm and fly
 Assessed chance that two proteins interact with
each other based on their joint sequence identities
 Performed similar analysis based on joint E-values
 All protein pairs with JI ≥ 80%
with a known interacting
pair will interact with each other
 More than half of protein pairs with JE ≤ 10^-70 could be
experimentally verified.
Yu, H. et al. (2004) Genome Res. 14: 1107-1118
PMID: 15173116
Examples of Protein-Protein Interologs
 In C. elegans, mpk-1 was experimentally shown to
interact with 26 other proteins (by yeast 2-hybrid)
 Ste5 is the homolog of Mpk-1 in S. cerevisiae
 Based on the similarity between the interaction
partners of mpk-1 and their closest homologs in S.
cerevisiae, the interolog approach predicted 5 of
the 6 subunits of the Ste5 complex in S. cerevisiae
 This paper has been cited >100 times
 Why the interest in predicting protein-protein
interactions?
 Determining protein-protein interactions is
challenging and the high-throughput (genome-wide) methods are still difficult and expensive to
conduct
 Identifying candidate interaction partners for a
targeted pull-down assay is a more viable strategy
for most labs
BIPS: BIANA Interolog Prediction Server
• Based on the concept of interologs
• Pre-defined alignments
• Can submit a list of proteins to get predicted interaction partners
• Can filter the predicted list to increase confidence
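The interolog idea behind such a server can be sketched in a few lines: take known interactions from a source organism, map both partners to their orthologues in the target organism, and report the mapped pairs as predictions. This is a conceptual illustration only, not how BIPS itself is implemented. All function names and protein identifiers are hypothetical.

```python
def predict_interologs(known_interactions, orthologs):
    """Transfer interactions from a source to a target organism.

    known_interactions: iterable of (A, B) pairs in the source organism.
    orthologs: dict mapping a source protein to its target orthologue.
    Returns a set of unordered predicted pairs (A', B').
    """
    predicted = set()
    for a, b in known_interactions:
        a_prime = orthologs.get(a)
        b_prime = orthologs.get(b)
        # Both partners need an orthologue for the interaction to transfer
        if a_prime is not None and b_prime is not None:
            predicted.add(frozenset((a_prime, b_prime)))
    return predicted


def predicted_partners(query, predicted):
    """List the predicted interaction partners of one target protein."""
    partners = set()
    for pair in predicted:
        if query in pair:
            others = pair - {query}
            # A single-member pair represents a predicted self-interaction
            partners.update(others if others else {query})
    return sorted(partners)


# Hypothetical worm interactions and worm-to-yeast orthologue map
worm_ppis = [("w1", "w2"), ("w1", "w3"), ("w4", "w5")]
worm_to_yeast = {"w1": "y1", "w2": "y2", "w4": "y4", "w5": "y5"}

predictions = predict_interologs(worm_ppis, worm_to_yeast)
print(predicted_partners("y1", predictions))
```

The w1-w3 interaction is not transferred because w3 has no orthologue in the map. A confidence filter could be added by keeping only pairs whose joint identity or joint E-value passes a chosen cutoff.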
Today in computer lab
 Tutorial on finding PPIs in your gene list using
MiMI or iRefWeb
 Exploring a subset of PPIs using the STRING
database
 Prediction of interactions between homologs using the BIPS
server
 Exercise 4 on protein domain analysis