Protein Intrinsic Disorder, Cell Signaling, and Alternative Splicing

Center for Computational Biology and Bioinformatics
Outline of Talk
• Examples of intrinsically disordered proteins
• Prediction of natural disordered regions
• Disorder and cell signaling
• Disorder and molecular recognition
• Disorder and alternative splicing
• Protein isoforms and functional diversity via the
linkage of alternative splicing and intrinsic
disorder
Molecular Recognition Element (MoRE)
[Figure: p27kip1 bound to the CDK-Cyclin A complex.]
3D structure from:
Russo A et al., Nature 382:325-331 (1996)
Disorder and Function
Category | Change | Examples | Descriptions
Molecular Recognition | DO | 113 | Inter- and intra-protein, ssDNA, dsDNA, tRNA, rRNA, mRNA, nRNA, bilayers, ligands, co-factors, metals
Protein Modification | Variable | 36 | Acetylation, fatty acylation, glycosylation, methylation, phosphorylation, ADP-ribosylation, ubiquitination, proteolytic digestion
Entropic Chains | Variable | 17 | Linkers, spacers, bristles, clocks, springs, detergents, self-transport
Dunker AK et al., Adv Protein Chem 62: 25-49 (2002)
Prediction of Disorder
Disordered Sequence Data
Attribute Selection or Extraction
Separate Training and Testing Sets
Predictor Training
Predictor Validation on Out-of-Sample Data
Prediction
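The steps on this slide can be sketched in a few lines of Python. This is only a toy illustration, not any of the published predictors: the sequences and labels in TOY_DATA are hypothetical, the DISORDER_PROMOTING residue set is an assumption made for the example, and the "predictor" is a single threshold on one windowed composition feature.

```python
import random

# Assumption: a residue set often described as disorder-promoting (illustrative only).
DISORDER_PROMOTING = set("AEGKPQRS")

# Hypothetical toy data: (sequence, per-residue labels), 'D' = disordered, 'O' = ordered.
TOY_DATA = [
    ("PEKS" * 4 + "VLIW" * 4, "D" * 16 + "O" * 16),
    ("FYCW" * 3 + "GQRA" * 5, "O" * 12 + "D" * 20),
    ("SPEG" * 3 + "ILVF" * 5 + "KRQE" * 2, "D" * 12 + "O" * 20 + "D" * 8),
    ("WLIV" * 6 + "EKSP" * 2, "O" * 24 + "D" * 8),
]

def window_feature(seq, i, half_width=3):
    """Attribute: fraction of disorder-promoting residues in a window around i."""
    lo, hi = max(0, i - half_width), min(len(seq), i + half_width + 1)
    window = seq[lo:hi]
    return sum(aa in DISORDER_PROMOTING for aa in window) / len(window)

def extract(proteins):
    """Attribute extraction: one (feature, is_disordered) pair per residue."""
    return [(window_feature(seq, i), label == "D")
            for seq, labels in proteins
            for i, label in enumerate(labels)]

def accuracy(examples, threshold):
    """Fraction of residues whose thresholded feature matches the label."""
    return sum((x >= threshold) == y for x, y in examples) / len(examples)

def train(examples):
    """Predictor training: choose the threshold with the best training accuracy."""
    candidates = [t / 20 for t in range(21)]
    return max(candidates, key=lambda t: accuracy(examples, t))

def run_pipeline(proteins, seed=0):
    """Split by protein, train on one half, validate on the held-out half."""
    shuffled = proteins[:]
    random.Random(seed).shuffle(shuffled)
    split = len(shuffled) // 2
    train_set, test_set = shuffled[:split], shuffled[split:]
    threshold = train(extract(train_set))
    return threshold, accuracy(extract(test_set), threshold)

threshold, test_accuracy = run_pipeline(TOY_DATA)
print(f"threshold={threshold:.2f}, out-of-sample accuracy={test_accuracy:.2f}")
```

The split is made per protein rather than per residue so that neighboring residues of one protein never appear in both the training and the testing set. The toy data are built to be easy to separate, so the accuracy printed here says nothing about real predictors.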
www.disprot.org
Predictor | Description
DisEMBL™ | Intrinsic Protein Disorder Prediction
DISOPRED2 | Disorder Prediction Server
DRIPPRED | Web based predictor for disordered regions in proteins
FoldIndex© | Estimate the fold probability of a protein
GlobPlot 2 | Intrinsic Protein Disorder, Domain & Globularity Prediction
IUPred | Prediction of Intrinsically Unstructured Proteins
PONDR® | Predictors of Natural Disordered Regions
PreLink | Prediction of unfolded segments in a protein sequence based on amino acid composition
RONN | Regional Order Neural Network
VL2 | DisProt Predictor of Intrinsically Disordered Regions
VL3, VL3H, VL3E | DisProt Predictor of Intrinsically Disordered Regions
p53 MoREs
[Figure: PONDR® VL-XT score plot for p53.]
Oldfield et al., Biochemistry 44: 12454-12470 (2005)
Protein Interaction Domains
http://www.mshri.on.ca/pawson/domains.html
GYF Domain and CD2 Chain B
Freund et al., (2002) Embo J. 21:5985-5995
GYF Domain of CD2 Binding Protein
Freund et al., (1999) Nat. Struct. Biol. 6:656-660
CD2: Binding Partner of GYF Domain
The consensus GYF binding site has the sequence ppppghr. The
peptide in the crystal structure has the amino acid sequence shrppppghrv.
Freund et al., (1999) Nat. Struct. Biol. 6:656-660
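As a small worked check, the consensus can be located inside the peptide with a plain substring search. The two sequences are taken from the slide; the variable names are only illustrative.

```python
consensus = "ppppghr"      # GYF binding-site consensus quoted on the slide
peptide = "shrppppghrv"    # peptide sequence quoted on the slide

# Zero-based position of the consensus within the peptide (-1 if absent).
start = peptide.find(consensus)

# Residues of the peptide that flank the consensus on either side.
n_flank = peptide[:start]
c_flank = peptide[start + len(consensus):]

print(start, n_flank, c_flank)  # prints: 3 shr v
```

The consensus occupies residues 4 to 10 of the 11-residue peptide, with three flanking residues on the N-terminal side and one on the C-terminal side.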
Analysis of Signaling Interactions
• Examined each interaction on Pawson’s website.
• Almost all of the interactions involved ordered
regions binding to disordered partners.
• Conclusion: if Pawson’s examples are typical,
then a very significant proportion of protein-protein
signaling interactions use disordered regions.
Parallel Paradigms
Catalysis: AA seq → 3-D Structure → Function
Signaling: AA seq → Disordered Ensemble → Function
Alternative Splicing and Intrinsic Disorder
• Find proteins with both ordered and disordered
regions.
• Find mRNA alternative splicing information for these
proteins and map to the ordered and disordered
regions.
• For alternatively spliced regions of mRNA, do they
code for ordered protein more often or do they code
for disordered protein more often?
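The mapping step in the bullets above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code behind the talk: the per-residue annotation string, the AS region coordinates, and the three category names are all hypothetical.

```python
def disorder_fraction(annotation, start, end):
    """Fraction of residues in the half-open range [start, end) labeled 'D'."""
    region = annotation[start:end]
    if not region:
        raise ValueError("empty AS region")
    return region.count("D") / len(region)

def categorize(fraction):
    """Assumed three-way labeling of an AS region by its disorder fraction."""
    if fraction == 0.0:
        return "fully ordered"
    if fraction == 1.0:
        return "fully disordered"
    return "mixed"

# Hypothetical protein: 'O' = ordered residue, 'D' = disordered residue.
annotation = "O" * 30 + "D" * 20 + "O" * 10

# Hypothetical AS regions as zero-based, half-open residue ranges.
as_regions = [(5, 15), (28, 40), (35, 50)]

results = [(start, end, disorder_fraction(annotation, start, end))
           for start, end in as_regions]

for start, end, fraction in results:
    print(f"AS region {start}-{end}: {fraction:.0%} disordered, {categorize(fraction)}")
```

Collecting these fractions over every AS region in a dataset gives the quantity needed to compare how often AS regions fall in ordered versus disordered protein.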
Alternative Splicing
[Figure, built up over four slides: a gene with 5’ UTR, coding sequence, and 3’ UTR. Transcription gives mRNA 1 and mRNA 2; translation and folding give Isoform 1 and Isoform 2. The AS region is marked.]
Structural Studies of AS
Disordered AS regions
Pyrophosphorylase
Structured AS regions
RAC1
Tumor necrosis factor
Glutathione S-transferase
Sulphotransferase
Studying the Relationship Between ID and AS
DisProt: database of proteins with experimentally determined
structure and disorder (www.disprot.org)
ASG (AS Gallery)
SwissProt (VarSplic)
ASED dataset:
• 46 proteins
• 74 characterized AS regions
• >19,000 characterized residues, 35% ID
Results on ASED
Distribution of structurally characterized AS regions
Enlarging the Dataset
PONDR® VSL1 ID predictor (> 80% accuracy)
Validation: ASED dataset
Analysis: ASSP dataset
• 558 AS human proteins from SwissProt
• 1,266 AS regions
Global Results
AS regions disorder distributions in ASED and ASSP
[Figure: histogram of relative frequency (axis 0 to 0.7) against disorder content of AS regions (bins 0-20%, 20-40%, 40-60%, 60-80%, 80-100%), with three series: ASED experimental, ASED predicted, ASSP predicted.]
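The kind of distribution plotted on this slide can be computed by binning per-region disorder content into five 20% bins and normalizing by the number of regions. The sketch below uses hypothetical disorder fractions, not the ASED or ASSP values, and the rule that a value on a bin edge goes to the upper bin is an assumption.

```python
BIN_LABELS = ["0-20%", "20-40%", "40-60%", "60-80%", "80-100%"]

def relative_frequencies(fractions):
    """Relative frequency of AS regions in each 20% disorder-content bin."""
    if not fractions:
        raise ValueError("no AS regions given")
    counts = [0] * len(BIN_LABELS)
    for f in fractions:
        # Assumed edge rule: edges go to the upper bin, except 100% stays in the last bin.
        index = min(int(f * len(BIN_LABELS)), len(BIN_LABELS) - 1)
        counts[index] += 1
    return {label: count / len(fractions)
            for label, count in zip(BIN_LABELS, counts)}

# Hypothetical disorder content (fraction of disordered residues) for eight AS regions.
example_fractions = [0.0, 0.1, 0.5, 0.85, 1.0, 1.0, 0.95, 0.3]

frequencies = relative_frequencies(example_fractions)
for label, freq in frequencies.items():
    print(f"{label}: {freq:.3f}")
```

Because the frequencies are normalized per dataset, series computed from datasets of different sizes can be placed on the same axis.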
Alternative Splicing and Disorder
• Ordered proteins: active-site residues are non-local in
sequence and become associated by protein folding
• Disordered proteins and regions: functional
residues are localized in sequence
• Functional regions for signaling and regulation are
located one after another
• Alternative splicing edits functional sets and
thereby leads to regulatory and signaling diversity
Breast Cancer Protein 1 (BRCA1)
Summary
• Protein signaling interactions involve intrinsic
disorder (ID) a high percentage of the time.
• Alternative splicing (AS) often occurs in regions of
pre-mRNA that code for intrinsic disorder.
• AS + ID facilitate regulatory and signaling diversity.
• Is AS + ID the critical combination for the evolution
of multi-cellular organisms?
Acknowledgements
Indiana University
Temple University
Predrag Radivojac
Pedro Romero
Marc Cortese
Gerard Go
Amrita Mohan
Jie Sun
Siama Zaida
Jack Yang
Zoran Obradovic
Slobodan Vucetic
Vladimir Vacic
Kang Peng
University of Idaho
Celeste J. Brown
Chris Williams
Molecular Kinetics
Vladimir Uversky
Yugong Cheng
Rockefeller University
Lilia Iakoucheva Sebat
University of Wisconsin
John Markley
Chris Oldfield
UCSF
Ethan Garner
PNNL
Richard Smith
Eric Ackerman
Support
• NSF CSE II 9711532
• NIH R01 LM007688
• USDA 2000 1740
• INGEN®, Lilly Endowment
• Molecular Kinetics