Talk 2 - European Bioinformatics Institute

MS Identification
Dr. Juan Antonio VIZCAINO
PRIDE Group coordinator
PRIDE team, Proteomics Services Group
PANDA group
European Bioinformatics Institute
Hinxton, Cambridge
United Kingdom
EBI is an Outstation of the European Molecular Biology Laboratory.
Overview …
• Search engines: peptide identification
• Protein inference
• De novo and spectral searches
• Choosing the right protein sequence DB
• You need to learn many things…
Juan A. Vizcaíno
juan@ebi.ac.uk
EBI Bulgaria Roadshow
Rotterdam, 12 June 2012
It should not be a black box…
From: Lilley et al., Proteomics, 2011
MS proteomics: Shot-gun/bottom-up approaches
[Figure: protocol overview. Proteins are turned into peptides; MS analysis of the peptides is followed by fragmentation and MS/MS analysis, and the spectra are searched against a sequence database. Spectrum axes: m/z vs. %.]
PMF IDENTIFICATION
Peptide Mass Fingerprinting (MS)
[Figure: MS spectrum used for Peptide Mass Fingerprinting (PMF), together with the molecular weight (MW) of the protein. Axes: m/z vs. %.]
- Each peak in the spectrum represents a peptide (or a mixture of peptides)
- The spectrum gives information about mass and charge
- Rarely used at present, except in gel-based approaches (where the molecular weight of the protein is known)
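As a rough illustration of the PMF idea, the sketch below digests candidate protein sequences in silico, computes monoisotopic peptide masses, and counts how many observed masses each candidate explains. The simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), the function names and the 0.2 Da tolerance are assumptions made for this example; real PMF tools also handle modifications and score the matches statistically.

```python
import re

# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def tryptic_digest(protein):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def pmf_hits(observed_masses, protein, tolerance=0.2):
    """Count observed masses that match a theoretical tryptic peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical)
        for m in observed_masses
    )
```

The candidate protein with the most hits would be the best PMF match in this toy setting.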
Peptide Mass Fingerprinting (MS) on the web
Aldente (Phenyx): http://www.expasy.org/tools/aldente/
ASCQ_ME: https://www.genopole-lille.fr/logiciel/ascq_me/
Bupid: http://zlab.bu.edu/Amemee/
Mascot: http://www.matrixscience.com/search_form_select.html
MassSearch: http://www.cbrg.ethz.ch/services/MassSearch
MS-Fit (Protein Prospector): http://prospector.ucsf.edu/prospector/mshome.htm
PepMAPPER: http://www.nwsr.manchester.ac.uk/mapper/
Profound (Prowl): http://prowl.rockefeller.edu/prowl-cgi/profound.exe
XProteo: http://xproteo.com:2698/
MS/MS IDENTIFICATION
MS/MS
[Figure: an MS spectrum (Peptide Mass Fingerprinting, PMF) and, after fragmentation, an MS/MS spectrum. Axes: m/z vs. %.]
Fragmentation provides peptide sequence information (on top of mass and charge).
Three types of MS/MS identification
- Protein database based comparison: a sequence from the database gives a theoretical spectrum, which is compared with the experimental spectrum.
- Sequential comparison (de novo approaches): the experimental spectrum gives a de novo sequence, which is compared with a sequence from the database.
- Spectral comparison: the experimental spectrum is compared with an experimental spectrum from a spectral library.
Modified From: Eidhammer, Flikka, Martens, Mikalsen – Wiley 2007
MS proteomics: peptide IDs and protein IDs
[Figure: a set of MS/MS spectra (axes: m/z vs. %) alongside proteins.]
MS proteomics: peptide IDs and protein IDs
[Figure: MS/MS spectra (axes: m/z vs. %) are matched by a search engine against a sequence database (UniProt, IPI, RefSeq) to give peptides (TDMDNQIVVSDYAQMDR, LFDQAFGLPR, AKPLMELIER, DESTNVDMSLAQR, DIVVQETMEDIDK, NGMFFSTYDR, GTAGNALMDGASQL), shown alongside proteins.]
SEARCH ENGINES
Search engines
[Figure: sequence database matching. Proteins from the sequence database (UniProt, IPI, RefSeq) give peptides (TDMDNQIVVSDYAQMDR, LFDQAFGLPR, AKPLMELIER, DESTNVDMSLAQR, DIVVQETMEDIDK, NGMFFSTYDR, GTAGNALMDGASQL, VDMSLAQR, DIVVQETMEDIDK, …), the peptides give theoretical spectra, and the theoretical spectra are matched against the experimental spectra. Axes: m/z vs. %.]
Search engines
[Figure: an experimental spectrum next to a theoretical spectrum, both on an m/z axis.]
How good is the correlation?
- Scores are generated by search engines
- Usually the best match is kept
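A minimal sketch of the comparison step, assuming singly charged b and y ions only and a plain shared-peak count as the score. The function names, the 0.5 Da tolerance and the scoring itself are illustrative assumptions; the search engines named in this talk use more elaborate scores.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def theoretical_ions(peptide):
    """m/z values of the singly charged b and y ions of a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion (N-terminal part)
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion (C-terminal part)
    return sorted(ions)


def shared_peak_count(experimental_mz, peptide, tolerance=0.5):
    """Number of experimental peaks explained by a theoretical b or y ion."""
    theoretical = theoretical_ions(peptide)
    return sum(
        any(abs(mz - t) <= tolerance for t in theoretical)
        for mz in experimental_mz
    )


def best_match(experimental_mz, candidates, tolerance=0.5):
    """Keep the candidate peptide with the highest score."""
    return max(
        candidates,
        key=lambda p: shared_peak_count(experimental_mz, p, tolerance),
    )
```

For example, a spectrum generated from LFDQAFGLPR is best explained by that peptide among the candidates LFDQAFGLPR, AKPLMELIER and NGMFFSTYDR.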
Search engines
Taken from Nesvizhskii, J Proteomics, 2010
The most popular algorithms
• MASCOT (Matrix Science)
http://www.matrixscience.com
• SEQUEST (Scripps, Thermo Fisher Scientific)
http://fields.scripps.edu/sequest
• X!Tandem (The Global Proteome Machine Organization)
http://www.thegpm.org/TANDEM
• OMSSA (NCBI)
http://pubchem.ncbi.nlm.nih.gov/omssa/
Overall concept of scores and cut-offs
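[Figure: score distributions of incorrect and correct identifications, separated by a threshold score. Correct identifications below the threshold are false negatives; incorrect identifications above it are false positives.]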
Adapted from: www.proteomesoftware.com – Wiki pages
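The trade-off behind a threshold score can be sketched with a few lines of counting. The scores and correct/incorrect labels below are invented for illustration (in a real experiment the true labels are unknown), and the function name is hypothetical.

```python
def confusion_at_threshold(scored_ids, threshold):
    """Count outcomes for (score, is_correct) pairs at a given cut-off."""
    counts = {"true_positives": 0, "false_positives": 0,
              "false_negatives": 0, "true_negatives": 0}
    for score, is_correct in scored_ids:
        accepted = score >= threshold
        if accepted and is_correct:
            counts["true_positives"] += 1
        elif accepted and not is_correct:
            counts["false_positives"] += 1   # incorrect but above threshold
        elif is_correct:
            counts["false_negatives"] += 1   # correct but below threshold
        else:
            counts["true_negatives"] += 1
    return counts


# Made-up identifications: (search engine score, is the match correct?)
identifications = [(10, False), (20, False), (30, True),
                   (35, False), (40, True), (55, True)]

relaxed = confusion_at_threshold(identifications, 32)
strict = confusion_at_threshold(identifications, 50)
```

With these numbers, raising the threshold from 32 to 50 removes the false positive but increases the false negatives from one to two.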
Playing with probabilistic cut-off scores
[Figure: identifications and false positives plotted for the cut-offs p=0.05, p=0.01, p=0.005 and p=0.0005, in order of higher stringency. Axis scales: 0% to 100% and 0% to 6%.]
SEQUEST
• Very well established search engine
• Can be used for MS/MS (PFF) identifications
• Based on a cross-correlation score (includes experimental peak height)
• Published core algorithm (patented, licensed to Thermo Fisher Scientific)
• Provides preliminary (Sp) score, rank, cross-correlation score (XCorr), and score difference between the top two ranks (deltaCn, ΔCn)
• Thresholding is up to the user, and is commonly done per charge state
• Many extensions exist to perform a more automatic validation of results
\[
\mathrm{XCorr} = \frac{\mathrm{CrossCorr}}{\operatorname{avg}\left(\mathrm{AutoCorr}\right)_{\text{offset} = -75 \text{ to } 75}}
\]

\[
\mathrm{deltaCn} = \frac{\mathrm{XCorr}_1 - \mathrm{XCorr}_2}{\mathrm{XCorr}_1}
\]
Search engines: Sequest
The XCorr is high if the direct comparison is significantly greater than the background.
deltaCn measures how good the XCorr is relative to the next best match.
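The two SEQUEST quantities can be sketched on binned intensity vectors. This is only an illustration of the ratio form shown on the slide: the background is taken here as the average correlation over offsets from -75 to 75 bins, which is an assumed reading of the slide and not SEQUEST's actual implementation. All function names are hypothetical.

```python
def shifted_dot(a, b, offset):
    """Dot product of vector a with vector b shifted by `offset` bins."""
    total = 0.0
    for i, value in enumerate(a):
        j = i + offset
        if 0 <= j < len(b):
            total += value * b[j]
    return total


def xcorr_ratio(experimental, theoretical, max_offset=75):
    """Direct comparison (offset 0) divided by the average over all offsets."""
    direct = shifted_dot(experimental, theoretical, 0)
    offsets = range(-max_offset, max_offset + 1)
    background = sum(
        shifted_dot(experimental, theoretical, k) for k in offsets
    ) / len(offsets)
    return direct / background if background > 0 else 0.0


def delta_cn(xcorr_values):
    """(XCorr1 - XCorr2) / XCorr1 for the two top-ranked candidates."""
    ranked = sorted(xcorr_values, reverse=True)
    if len(ranked) < 2 or ranked[0] <= 0:
        raise ValueError("need two candidates and a positive best XCorr")
    return (ranked[0] - ranked[1]) / ranked[0]
```

A deltaCn close to 0 means the second-best candidate scores almost as well as the best one, so the top match is less trustworthy.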
Search engines: Mascot
• Very well established search engine
• Can do MS (PMF) and MS/MS (PFF) identifications
• Based on the MOWSE score
• Unpublished core algorithm (trade secret)
• Predicts an a priori threshold score that identifications need to pass
• From version 2.2, Mascot allows integrated decoy searches
• Provides rank, score, threshold and expectation value per identification
• Customizable confidence level for the threshold score
Search engines: Mascot
www.matrixscience.com
Search engines: X!Tandem
• Open source search engine
• Can be used for MS/MS experiments
• Based on a hyperscore that only takes into account b and y ions
• Published core algorithm and it is freely available
• Fast and able to handle PTMs in an iterative fashion
• Used as an auxiliary search engine
\[
\text{by-Score} = \sum \text{intensities of peaks matching b-type or y-type ions}
\]

\[
\mathrm{HyperScore} = \text{by-Score} \times N_b! \times N_y!
\]

Here N_b and N_y are the numbers of matched b and y ions.
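The hyperscore formula on this slide translates almost directly into code. The sketch below follows the slide's form only; the function name and the example intensities are assumptions, and the real X!Tandem implementation differs in detail.

```python
from math import factorial


def hyperscore(matched_b_intensities, matched_y_intensities):
    """by-Score multiplied by the factorials of the matched ion counts."""
    by_score = sum(matched_b_intensities) + sum(matched_y_intensities)
    n_b = len(matched_b_intensities)  # number of matched b ions
    n_y = len(matched_y_intensities)  # number of matched y ions
    return by_score * factorial(n_b) * factorial(n_y)


# Made-up example: two matched b ions and three matched y ions.
score = hyperscore([10.0, 20.0], [5.0, 5.0, 5.0])
```

The factorial terms reward matches that explain many ions of both series, not just a few intense peaks.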
Search engines: OMSSA
• Open source search engine
• Can be used for MS/MS experiments
• Relies on a Poisson distribution
• Published core algorithm and it is freely available
• Provides an expectancy score, similar to the BLAST E-value
• Very good performance in comparison with the others
• Used as an auxiliary search engine
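To give a feel for how a Poisson model can lead to an E-value-like score, the sketch below computes the probability of seeing at least k matched peaks by chance and scales it by the number of candidate peptides. The parameter names and this exact way of forming the expectation value are simplifying assumptions for illustration, not OMSSA's published procedure.

```python
from math import exp, factorial


def poisson_tail(k, lam):
    """P(X >= k) for a Poisson-distributed X with mean lam."""
    below_k = sum(exp(-lam) * lam ** i / factorial(i) for i in range(k))
    return 1.0 - below_k


def expectation_value(k_matches, expected_random_matches, n_candidates):
    """Expected number of candidates reaching k matches purely by chance."""
    return poisson_tail(k_matches, expected_random_matches) * n_candidates


# Made-up example: 8 matched peaks where 1 random match is expected,
# with 1000 candidate peptides in the precursor mass window.
e_value = expectation_value(8, 1.0, 1000)
```

As with a BLAST E-value, smaller values indicate matches that are less likely to be random.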
MS proteomics: peptide IDs and protein IDs
[Figure: MS/MS spectra (axes: m/z vs. %) are matched by a search engine against a sequence database (UniProt, IPI, RefSeq) to give peptides (TDMDNQIVVSDYAQMDR, LFDQAFGLPR, AKPLMELIER, DESTNVDMSLAQR, DIVVQETMEDIDK, NGMFFSTYDR, GTAGNALMDGASQL), shown alongside proteins.]
So far, we have actually identified peptides, not proteins.
MS proteomics: peptide IDs and protein IDs
peptides
proteins
IPI00302927
IPI00025512
IPI00002478
IPI00185600
IPI00014537
IPI00298497
IPI00329236
IPI00002232
TDMDNQIVVSDYAQ
MDRTW
LFDQAFGLPR
AKPLMELIER
DESTNVDMSLAQR
DIVVQETMEDIDK
NGMFFSTYDR
GTAGNALMDGASQL
Protein Inference is complex!!
PROTEIN INFERENCE
Intermezzo: Protein inference
The minimal and maximal explanatory sets
[Figure: two peptide-to-protein tables for peptides a, b, c, d and proteins prot X, prot Y, prot Z. One shows the minimal set (Occam), the other the maximal set (anti-Occam); "The Truth" is marked between them.]
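The two explanatory sets can be sketched as follows. The peptide-to-protein mapping is hypothetical (it is not the table from the slide), and the minimal set is found with a greedy set-cover heuristic, which is one common approximation and not necessarily what any particular tool does.

```python
def maximal_set(peptide_to_proteins):
    """Anti-Occam: every protein that contains at least one peptide."""
    return set().union(*peptide_to_proteins.values())


def minimal_set(peptide_to_proteins):
    """Occam: greedy approximation of the fewest proteins explaining all peptides.

    Assumes every peptide maps to at least one protein.
    """
    uncovered = set(peptide_to_proteins)
    candidates = maximal_set(peptide_to_proteins)
    chosen = set()

    def explained(protein):
        return {p for p in uncovered if protein in peptide_to_proteins[p]}

    while uncovered:
        # Pick the protein explaining most still-unexplained peptides;
        # sorting makes ties deterministic.
        best = max(sorted(candidates - chosen), key=lambda prot: len(explained(prot)))
        chosen.add(best)
        uncovered -= explained(best)
    return chosen


# Hypothetical mapping of identified peptides to the proteins containing them.
mapping = {
    "a": {"prot X"},
    "b": {"prot X", "prot Y"},
    "c": {"prot X", "prot Z"},
    "d": {"prot Z"},
}
```

Here the maximal set reports prot X, prot Y and prot Z, while the minimal set drops prot Y because its only peptide is already explained by prot X.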
Intermezzo: Protein inference
Slide from J. Cottrell, Matrix Science
Protein inference
A
B
C
D
Unambiguous peptide
OTHER APPROACHES TO PERFORM
MS/MS IDENTIFICATION
De novo approaches
Example of manual de novo sequencing of an MS/MS spectrum
No database is needed to extract a sequence!
Algorithms: Lutefisk, Sherenga, PEAKS, PepNovo, …
References: Dancik 1999; Taylor 2000; Fernandez-de-Cossio 2000; Ma 2003; Zhang 2004; Frank 2005; Grossmann 2005; …
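The core idea of de novo sequencing, reading residues from the mass differences between consecutive peaks of one ion series, can be sketched as below. The ladder in the example is synthetic and complete; real spectra have missing peaks, noise and mixed ion types, which is what the algorithms listed above are designed to cope with. The function name and the 0.01 Da tolerance are assumptions.

```python
# Monoisotopic residue masses in Da; "L" stands for both L and I,
# which have the same mass and cannot be told apart this way.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks_mz, tolerance=0.01):
    """Translate gaps between consecutive peaks into residues ('?' if unknown)."""
    peaks = sorted(peaks_mz)
    sequence = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        residue = next(
            (aa for aa, mass in RESIDUE_MASS.items() if abs(gap - mass) <= tolerance),
            "?",
        )
        sequence.append(residue)
    return "".join(sequence)


# Synthetic ladder: start at an arbitrary m/z and add G, A, S, P in turn.
ladder = [100.0]
for aa in "GASP":
    ladder.append(ladder[-1] + RESIDUE_MASS[aa])
```

Reading this ladder from low to high m/z gives the residues in the order in which they were added.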
Spectral searching
• Concept: To compare experimental spectra to other
experimental spectra.
• There are many spectral libraries publicly available (for
instance, from NIST)
• Custom ‘search engines’ have been developed:
• SpectraST (TPP)
• X!Hunter (GPM)
• It has been claimed that these searches are more sensitive than sequence database approaches
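A spectral comparison can be sketched with a normalized dot product between binned spectra. This is a common similarity measure chosen for illustration and is not claimed to be the exact scoring of SpectraST or X!Hunter; the peak lists, the library entries and the 1 m/z bin width are made up.

```python
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks falling into the same bin."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def normalized_dot(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity of two spectra: 1 for identical, 0 for no shared bins."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_library_match(query, library):
    """Name of the library spectrum most similar to the query."""
    return max(library, key=lambda name: normalized_dot(query, library[name]))


# Made-up query spectrum and spectral library.
query = [(175.1, 50.0), (300.2, 100.0), (500.3, 20.0)]
library = {
    "pepA": [(175.2, 45.0), (300.1, 90.0), (500.4, 25.0)],
    "pepB": [(210.0, 80.0), (410.0, 60.0)],
}
```

Because library spectra carry real peak intensities, this kind of comparison can use information that a theoretical spectrum lacks.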
Spectral searching (2)
http://peptide.nist.gov/
COMBINING DIFFERENT SEARCH
APPROACHES
Multi-stage peptide identification strategy
Goal: “Squeeze” your good quality experimental spectra
Taken from Nesvizhskii, J Proteomics, 2010
PROTEIN SEQUENCE DATABASES
What is needed from a protein database
1. Comprehensive (whatever is not in the DB will not be
included in your results).
2. Not too redundant at the protein sequence level
- Protein inference gets easier
- A database that is too big slows the search and raises the chance of random matches.
3. Quality of annotation
4. Stability of identifiers
Main databases used
a) UniProt Knowledgebase (UniProtKB): SWISS-PROT (manually
curated)/ TrEMBL.
b) NCBI non-redundant database: It compiles all protein
sequences available from the following databases: ‘GenBank’
translations, the Protein Data Bank (PDB), UniProtKB/SwissProt, PIR and PRF.
c) Ensembl: Genomics centric resource. Integration of the
information with genomics is easy.
d) IPI (International Protein Index): It has been discontinued
(9/2012). Different builds for different species (Human, Mouse,
Cow, Rat, Zebrafish, Dog, Arabidopsis).
e) Model organism DBs (for instance, TAIR for Arabidopsis).
Databases for non-model organisms
- If the species is not well represented in the protein databases,
there is a much stronger need to search ESTs or genomic
databases.
- The search engine will translate each nucleotide sequence in all six reading frames.
- ESTs are not suitable for PMF approaches (incomplete proteins).
- The alternative is to filter comprehensive databases like UniProt by
species or genus, or to use a protein DB from a close organism.
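The translation of a nucleotide sequence in six reading frames (three offsets on each strand) can be sketched as follows, using the standard genetic code. The function names are illustrative, "*" marks a stop codon and "X" an unrecognised codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from the start of the sequence."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Three frames on the given strand, then three on the reverse complement."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    return [
        translate(strand[offset:])
        for strand in (dna, reverse_complement)
        for offset in range(3)
    ]


frames = six_frame_translation("ATGGCC")
```

Each translated frame can then be digested and searched like an ordinary protein sequence, at the cost of a search space roughly six times larger.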
Importance of choosing the right DB
- Since each database has a different focus, databases vary in completeness, degree of redundancy, and quality of annotation.
- Larger, more inclusive protein databases take longer to search.
- Larger resources may also give more false-positive identifications and reduced statistical significance (the probability of a random match is higher).
POST-VALIDATION OF RESULTS
Other concepts that would be nice to learn…
- Concepts of peptide and protein FDR (false discovery rate)
- Decoy databases
- Software such as PeptideProphet, ProteinProphet, …
- Influence of PTMs on the search
- Scoring of PTM positioning
…
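As a first taste of decoy databases and FDR, the sketch below builds reversed-sequence decoys and estimates the FDR above a score threshold as the ratio of decoy to target hits. Reversal is only one way to build decoys, and this simple ratio is one common estimator; the accession prefix, the function names and the example scores are assumptions.

```python
def make_decoys(target_db):
    """Reverse every target sequence and mark its accession as a decoy."""
    return {"DECOY_" + acc: seq[::-1] for acc, seq in target_db.items()}


def estimate_fdr(psms, threshold):
    """Decoy hits divided by target hits among (score, is_decoy) pairs."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def threshold_for_fdr(psms, max_fdr=0.01):
    """Lowest observed score at which the estimated FDR is within max_fdr."""
    for score in sorted({s for s, _ in psms}):
        if estimate_fdr(psms, score) <= max_fdr:
            return score
    return None


# Made-up peptide-spectrum matches: (score, matched a decoy sequence?)
psms = [(60, False), (55, False), (50, False), (45, False), (40, True),
        (38, False), (30, True), (25, False), (20, True)]
```

The decoy hits stand in for the incorrect target hits that cannot be observed directly, which is why the decoy database should resemble the target database in size and composition.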
Recommended reading….
Nesvizhskii, J Proteomics, 2010
and many more…
Conclusions
• Approaches to perform peptide and protein identification
• Sequence database based approaches: search engines
• The protein inference problem
• Importance of choosing the right protein database
• Many things to be learnt…
Remember: it should not be a black box…
From: Lilley et al., Proteomics, 2011
And still… we haven’t touched quantification at all
From: Vaudel et al., Proteomics, 2010
Questions?