MS/MS Protein view

How do we analyze proteomic data?
Session III
National Taiwan University Biotechnology Education Reform Summer Course
Topics for session III
1. Image analysis (for 2-DE gel)
2. Mass data analysis
3. Protein structure analysis
1. Image analysis
Examples of 2-DE results
[Figure: 2-DE gels from a healthy control and a patient. Workflow labels: digest to peptide fragments, then MS analysis.]
Unexpected variation between gels
Interior variation between 2-DE experiments
– Same loading amount?
– Same gel condition?
– Same staining condition?
Exterior variation after the gel is developed
– Unwanted spots (dye or reagent deposit)
– Dirty spots (hair, dust)
What does image analysis software do?
– spot detection
– unwanted spot filtering
– background subtraction
– normalization
– image matching
– expression comparison
– pI/MW calibration
– data organization
Currently available
2-DE image analysis software
– Melanie 4 (Swiss Institute of Bioinformatics, SIB)
– Phoretix 2D (Nonlinear Dynamics)
– Progenesis (Nonlinear Dynamics)
– Z3 and Z4000 (Compugen)
– Delta2D (Sunergia group)
– A-GelFox 2D (Alpha Innotech)
– Flicker (NCI, through internet)
Spot detection
One of the first and most important steps in 2-DE analysis.
– Locating the spots in the gel image
– Defining their shape
– Calculating measurement information (volume and area)
Filtering
Removal of unwanted spots.
What are unwanted spots?
– dust on the gel
– stain deposits
– bulky spots
Background subtraction
General background subtraction methods:
– No background
– Mode of non-spot
– Manual background
– Lowest boundary
– Average boundary
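The slides only name these methods. As a rough sketch, assuming "lowest boundary" and "average boundary" mean the minimum and mean pixel intensity along a spot's outline, and "mode of non-spot" means the most frequent intensity outside all spots, the background level could be chosen like this. The function names and the sample pixel values are hypothetical and not taken from any particular software package.

```python
from statistics import mean, mode

def background_level(method, boundary_pixels=(), non_spot_pixels=(), manual=0.0):
    """Return an assumed background intensity for one spot."""
    if method == "none":
        return 0.0
    if method == "mode_of_non_spot":
        return float(mode(non_spot_pixels))   # most common value outside spots
    if method == "manual":
        return float(manual)                  # user-supplied level
    if method == "lowest_boundary":
        return float(min(boundary_pixels))    # darkest pixel on the outline
    if method == "average_boundary":
        return float(mean(boundary_pixels))   # mean of the outline pixels
    raise ValueError(f"unknown method: {method}")

def corrected_volume(spot_pixels, background):
    """Spot volume as the summed pixel intensity above the background."""
    return sum(max(p - background, 0.0) for p in spot_pixels)

# Hypothetical pixel intensities for one spot and its outline.
boundary = [12, 10, 11, 13]
spot = [40, 55, 60, 35]
for name in ("none", "lowest_boundary", "average_boundary"):
    level = background_level(name, boundary_pixels=boundary)
    print(name, corrected_volume(spot, level))
```

The choice of method changes the reported volume of every spot, so the same method should be used for all gels that are compared.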
Normalization
General normalization methods:
– Total spot volume
– Single spot
– Total volume ratio
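A minimal sketch of two of these methods, assuming "total spot volume" means expressing each spot as a percentage of the summed volume of all spots on the gel, and "single spot" means dividing every volume by one chosen reference spot. The spot names and volumes are invented for illustration.

```python
def normalize_total_volume(volumes):
    """Express each spot volume as a percentage of the total volume on the gel."""
    total = sum(volumes.values())
    return {spot: 100.0 * v / total for spot, v in volumes.items()}

def normalize_single_spot(volumes, reference_spot):
    """Express each spot volume relative to one reference spot."""
    ref = volumes[reference_spot]
    return {spot: v / ref for spot, v in volumes.items()}

# Hypothetical raw volumes from one gel.
gel = {"s1": 200.0, "s2": 600.0, "s3": 1200.0}
print(normalize_total_volume(gel))
print(normalize_single_spot(gel, "s1"))
```

Normalizing this way compensates for differences in loading amount and staining between gels, which is what makes spot volumes comparable across experiments.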
Image matching
Expression comparison
A two-fold increase or decrease in expression is generally considered significant.
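The two-fold rule can be sketched as a ratio test on normalized spot volumes. The function name, the default threshold argument, and the sample values are assumptions for illustration; the control volume is assumed to be greater than zero.

```python
def classify_expression(control, patient, threshold=2.0):
    """Label a matched spot as up, down, or unchanged by fold change."""
    ratio = patient / control
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"

# Hypothetical normalized volumes (control, patient) for three matched spots.
spots = {"s1": (10.0, 25.0), "s2": (10.0, 4.0), "s3": (10.0, 15.0)}
for name, (control, patient) in spots.items():
    print(name, classify_expression(control, patient))
```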
pI/MW calibration
observed or experimental pI/MW
pI calibration
MW calibration
Data organization
Spot annotation
2. Mass data analysis
Useful proteomic resource
http://tw.expasy.org/
Useful proteomic resource
Matrix Science - Mascot
http://www.matrixscience.com/
Three major functions in Mascot
– Peptide Mass Fingerprint (PMF): The experimental data are a list of peptide mass values from an enzymatic digest of a protein. (MALDI-TOF)
– Sequence Query: One or more peptide mass values associated with information such as partial or ambiguous sequence strings, amino acid composition information, MS/MS fragment ion masses, etc. A super-set of a sequence tag query.
– MS/MS Ion Search: Identification based on raw MS/MS data from one or more peptides. (LC/MS/MS)
Difference between MALDI-TOF and LC/MS/MS
[Figure: MALDI-TOF compared with LC/MS/MS]
2-1 PMF analysis
Raw data for PMF
m/z          Relative intensity
899.2076     2980.8123
905.2126     1471.3723
909.1917     2312.2317
915.2181     1533.8486
925.4637     1881.7635
938.3972     1528.9462
1044.3007    2111.9482
1050.3141    2396.1550
1060.2797    4689.0698
1066.2889    7302.0029
1072.2991    5688.8511
1078.3169    8919.1113
1084.2657    1474.5900
1088.2793    3180.5122
1094.2947    4573.5195
1104.2638    1546.4652
1110.2837    1470.9734
1271.3163    1498.0187
Mascot PMF query form
Mascot PMF parameters
– Your name; Email
– Search title
– Database
– Taxonomy
– Enzyme
– Monoisotopic or Average
– Modifications
– Protein Mass
– Peptide tol. ±
– Mass values
– Missed cleavages
– Data file
– Query
– Database
Database    Comment
EST         EST divisions of GenBank (currently EST_human, EST_mouse, EST_others)
MSDB        Comprehensive, non-identical protein database
NCBInr      Comprehensive, non-identical protein database
OWL         Non-identical protein database (obsolete)
Random      Random sequences for verifying scoring statistics
SwissProt   High quality, curated protein database
– Taxonomy
Restricting the search to a selected species:
  – ensures the hit list contains only entries from that species
  – speeds up the search
  – can help a weak match stand out
– Enzyme
Name           Cleave    Don't cleave   N or C term
Trypsin        KR        P              CTERM
Arg-C          R         P              CTERM
Asp-N          BD                       NTERM
Asp-N_ambic    DE                       NTERM
Chymotrypsin   FYWL      P              CTERM
CNBr           M                        CTERM
Formic_acid    D                        CTERM
Lys-C          K         P              CTERM
Lys-C/P        K                        CTERM
PepsinA        FL                       CTERM
Tryp-CNBr      KRM       P              CTERM
TrypChymo      FYWLKR    P              CTERM
Trypsin/P      KR                       CTERM
V8-DE          BDEZ      P              CTERM
V8-E           EZ        P              CTERM
CNBr+Trypsin   M                        CTERM
               KR        P              CTERM
None           see notes
semiTrypsin    see notes
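The Cleave and Don't cleave columns can be read as a simple rule: cut after any residue in the Cleave set unless the next residue is in the Don't cleave set. The sketch below applies that rule for CTERM enzymes only (NTERM enzymes such as Asp-N cut before the residue and are not handled). The function name, parameter names, and test sequence are hypothetical, and this is not Mascot's own implementation.

```python
def digest(sequence, cleave="KR", restrict="P"):
    """Fully digest a protein with a C-terminal cleaving enzyme.

    cleave:   residues after which the enzyme cuts (the "Cleave" column)
    restrict: residues that block the cut when they follow the cleavage
              site (the "Don't cleave" column); use "" for none
    """
    cleave_set, restrict_set = set(cleave), set(restrict)
    peptides, start = [], 0
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_set and sequence[i + 1] not in restrict_set:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # the C-terminal peptide
    return peptides

print(digest("AKRPGKTR"))               # Trypsin row: KR, not before P
print(digest("AKRPGKTR", restrict=""))  # Trypsin/P row: KR, P ignored
```

Passing `restrict=""` reproduces rows with an empty Don't cleave column, such as Trypsin/P.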
– Enzyme
"None" is not an allowed choice for a Peptide Mass Fingerprint, where the specificity of an enzyme is essential. If the search fails to produce a positive match, then try again with semiTrypsin (below) before resorting to "None".
"semiTrypsin" means that Mascot will search for peptides that show tryptic specificity (KR not P) at one terminus, but where the other terminus may be a non-tryptic cleavage. This is a half-way house between choosing "Trypsin" and "None".
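One way to picture the semiTrypsin search space: start from a fully tryptic peptide and let either end be truncated at any position, so that one terminus keeps its tryptic cleavage site and the other does not have to. This is only an illustrative reading of the description above, with a hypothetical function name; it says nothing about how Mascot enumerates candidates internally.

```python
def semi_tryptic(tryptic_peptide):
    """List peptides that keep at least one terminus of a tryptic peptide."""
    candidates = {tryptic_peptide}
    for k in range(1, len(tryptic_peptide)):
        candidates.add(tryptic_peptide[:k])  # tryptic N-terminus, ragged C-terminus
        candidates.add(tryptic_peptide[k:])  # ragged N-terminus, tryptic C-terminus
    return sorted(candidates)

print(semi_tryptic("TGK"))
```

Even for a three-residue peptide the candidate list grows from one entry to five, which is why semiTrypsin searches are slower and less specific than Trypsin searches, yet far more constrained than "None".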
– Monoisotopic or Average
  – Nominal mass values: calculated from integer atomic weights (H=1, C=12, N=14, O=16); not practical in proteomics.
  – Average mass values: equivalent to taking the centroid of the complete isotopic envelope.
  – Monoisotopic mass values: the mass of the first peak of the isotope distribution.
– Monoisotopic or Average
For peptides and proteins, the difference between an average and a
monoisotopic weight is approximately 0.06%.
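The size of this difference can be checked by summing residue masses both ways. The sketch below uses standard amino acid residue masses (rounded) and an arbitrary example peptide; the table, function name, and peptide are supplied for illustration and are not from the slides.

```python
# Residue masses in Da as (monoisotopic, average); standard values, rounded.
RESIDUES = {
    "G": (57.02146, 57.0519),   "A": (71.03711, 71.0788),
    "S": (87.03203, 87.0782),   "P": (97.05276, 97.1167),
    "V": (99.06841, 99.1326),   "T": (101.04768, 101.1051),
    "C": (103.00919, 103.1388), "L": (113.08406, 113.1594),
    "I": (113.08406, 113.1594), "N": (114.04293, 114.1038),
    "D": (115.02694, 115.0886), "Q": (128.05858, 128.1307),
    "K": (128.09496, 128.1741), "E": (129.04259, 129.1155),
    "M": (131.04049, 131.1926), "H": (137.05891, 137.1411),
    "F": (147.06841, 147.1766), "R": (156.10111, 156.1875),
    "Y": (163.06333, 163.1760), "W": (186.07931, 186.2132),
}
WATER = (18.01056, 18.01528)  # added once per peptide for the free termini

def peptide_mass(sequence, kind="mono"):
    """Neutral peptide mass: sum of residue masses plus one water."""
    idx = 0 if kind == "mono" else 1
    return sum(RESIDUES[aa][idx] for aa in sequence) + WATER[idx]

mono = peptide_mass("SAMPLER", "mono")
avg = peptide_mass("SAMPLER", "average")
print(f"mono={mono:.4f} Da  average={avg:.4f} Da  "
      f"difference={100 * (avg - mono) / mono:.3f}%")
```

Because the gap scales with mass, a fixed tolerance in Da that bridges the two mass types for a small peptide will not do so for a large protein, so the search must be told which type the instrument reports.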
[Figure: insulin (5.8 kD) and albumin (66.4 kD); panels for monoisotopic and average mass values at tolerances of 1 Da and 2 Da]
– Modifications
  – Most protein samples exhibit some degree of modification.
  – Natural post-translational modifications: phosphorylation and glycosylation.
  – Deliberate modifications: deliberately introduced during sample work-up, such as cysteine derivatisation.
– Modifications
Fixed modifications are applied universally, to every instance of the specified residue(s) or terminus.
Example: Carboxymethyl (Cys) means that all calculations will use 161 Da as the mass of cysteine.
Variable modifications are those which may or may not be present.
Example: if Oxidation (Met) is selected, and a peptide contains 3 methionines, Mascot will test for a match with the experimental data for that peptide containing 0, 1, 2, or 3 oxidised methionine residues.
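Both examples can be reproduced with a little arithmetic. The sketch assumes the standard monoisotopic mass shifts for carboxymethylation (+58.005479 Da) and oxidation (+15.994915 Da); the function name, the unmodified peptide mass of 1000.0 Da, and the peptide sequence are hypothetical.

```python
CYS_RESIDUE = 103.009185    # monoisotopic cysteine residue mass, Da
CARBOXYMETHYL = 58.005479   # monoisotopic shift of Carboxymethyl (Cys), Da
OXIDATION = 15.994915       # monoisotopic shift of Oxidation (Met), Da

# Fixed modification: every cysteine simply gets a new residue mass.
fixed_cys_mass = CYS_RESIDUE + CARBOXYMETHYL
print(f"Carboxymethyl-Cys residue mass: {fixed_cys_mass:.3f} Da")

def oxidation_candidates(unmodified_mass, peptide):
    """Variable modification: one candidate mass per possible number of
    oxidised methionines, from 0 up to the number of Met in the peptide."""
    n_met = peptide.count("M")
    return [unmodified_mass + k * OXIDATION for k in range(n_met + 1)]

# A peptide with 3 methionines yields 4 candidate masses (0, 1, 2, 3 oxidations).
for k, mass in enumerate(oxidation_candidates(1000.0, "MAMKM")):
    print(f"{k} oxidised Met: {mass:.4f} Da")
```

Each variable modification multiplies the number of candidate masses per peptide, so selecting many of them enlarges the search space and raises the chance of random matches.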
– Peptide tol. ±
The error window on experimental peptide mass values
Unit   Meaning
%      fraction expressed as a percentage
mmu    absolute milli-mass units, i.e. units of 0.001 Da
ppm    fraction expressed as parts per million
Da     absolute units of Da
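The four units can all be converted to an absolute window in Da for a given mass. This sketch assumes the fractional units (% and ppm) are taken relative to the theoretical mass being tested; the function names and sample masses are hypothetical.

```python
def tolerance_in_da(mass, value, unit):
    """Convert a peptide tolerance to an absolute window in Da."""
    if unit == "Da":
        return value
    if unit == "mmu":
        return value * 0.001        # milli-mass units
    if unit == "%":
        return mass * value / 100.0
    if unit == "ppm":
        return mass * value / 1e6
    raise ValueError(f"unknown unit: {unit}")

def within_tolerance(experimental, theoretical, value, unit):
    """True if the experimental mass falls inside the error window."""
    window = tolerance_in_da(theoretical, value, unit)
    return abs(experimental - theoretical) <= window

# 100 ppm at 1000 Da is a window of about 0.1 Da either side.
print(tolerance_in_da(1000.0, 100, "ppm"))
print(within_tolerance(1000.05, 1000.0, 100, "ppm"))
print(within_tolerance(1000.20, 1000.0, 100, "ppm"))
```

Fractional units widen the window for heavier peptides, while Da and mmu keep it constant across the mass range.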
– Missed cleavages
Missed cleavages = 0: complete digestion is assumed
Missed cleavages >= 1: allows for incomplete digestion
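Allowing missed cleavages means also considering peptides in which adjacent fully digested fragments remain joined. A minimal sketch, with a hypothetical function name and fragment list:

```python
def with_missed_cleavages(fragments, max_missed):
    """Join up to max_missed + 1 adjacent fragments of a complete digest.

    fragments:  peptides from a complete digest, in sequence order
    max_missed: maximum number of uncut cleavage sites inside one peptide
    """
    peptides = []
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides

complete = ["AK", "RPGK", "TR"]            # hypothetical complete digest
print(with_missed_cleavages(complete, 0))  # complete digestion only
print(with_missed_cleavages(complete, 1))  # adds AKRPGK and RPGKTR
```

A higher setting tolerates partial digestion but adds candidate peptides, so it is usually kept as low as the sample preparation allows.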
Submit and processing
Concise protein summary
Protein summary
PMF protein view (I)
Protein name
Score and Expect
MW and pI
coverage
PMF protein view (II)
Matched peptides
RMS error
Unmatched peptides
Protein information
2-2 MS/MS analysis
Raw data for MS/MS
Parent ion
Daughter ion
Mascot MS/MS query form
Protein summary
Most likely candidate
MS/MS Protein view (I)
Protein score: The sum of all highest scores within each peptide group
MS/MS Protein view (II)
Protein score: The sum of all highest scores within each peptide group
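The protein score rule can be sketched as follows, assuming a "peptide group" means all MS/MS matches to the same peptide sequence. The function name and the scores are invented for illustration and are not Mascot output.

```python
def protein_score(peptide_matches):
    """Sum the highest ion score found for each distinct peptide."""
    best = {}
    for peptide, score in peptide_matches:
        best[peptide] = max(score, best.get(peptide, score))
    return sum(best.values())

# Hypothetical (peptide, ion score) pairs; "AAK" was matched twice.
matches = [("AAK", 30), ("AAK", 45), ("GGR", 20)]
print(protein_score(matches))
```

Counting only the best match per peptide keeps a peptide that was fragmented many times from inflating the protein score.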
Peptide view
3. Protein structure analysis
Research Collaboratory for Structural Bioinformatics (RCSB)
Protein Data Bank (PDB)
Proteasome
Example: 3D structure of the proteasome
Stereo view
Rotation
Secondary Structure
Download