Lecture Note 1 - Instructional Web Server

Biochem 523b: Advanced Physical Methods:
Mass Spectrometry, X-ray Crystallography and NMR
1. Biological Mass Spectrometry (Lajoie)
(3 +1 lectures)
2. X-ray Crystallography (Ling)
(3 +1 lectures)
3. NMR (Shaw)
(3 +1 lectures)
First lecture: January 11 (room DSB 3008), 2:30-5:30 pm
Last Lecture: March 29
Final Exam: TBA, Mid-April
Reference material: Course notes and journal articles.
Evaluation:
Presentations (2) 30 marks
Assignments (3) 30 marks
Final exam 40 marks
Students will give a 20 min presentation in two of the three
topics discussed in the course.
There will be an assignment for each topic. The final exam will
be a 3hr exam with questions in each section.
Note: Course notes from Biochem 440 and 465a will be available for review
A. Mass Spectrometry
Lecture 1
Introduction
Definitions
Basic concepts
Mass Spectrometer
Ionization
MALDI
ESI
Multiply charged ions and deconvolution
MS/MS sequencing
Lecture 2
Mass analyzers
Lecture 3
Quantitation
General Principles
ICAT, iTRAQ, SILAC,etc.
PTMs
Phosphorylation
Glycosylation
Lecture 4 Short presentations by students
Non-Covalent studies
Protein Folding
HTP Proteomics
Metabolomics
Other PTMs
Etc
Mass Spectrometry
• A mass spectrometer is an instrument that measures the masses of individual
molecules that have been converted into ions, i.e., molecules that
have been electrically charged. It measures the mass of ions, not of neutral
molecules.
• Since molecules are so small, it is not convenient to measure
their masses in kilograms, grams, or pounds.
In fact, the mass of a single hydrogen atom is approximately
1.67 x 10^-24 grams.
The convenient unit of mass used by chemists and biochemists is the
dalton (Da), defined as follows: 1 Da = 1/12 of the mass of a single
atom of the carbon-12 isotope (12C).
This follows the accepted convention of defining the 12C isotope as
having exactly 12 mass units: 12C = 12.00000.
The dalton (Da) is also known as the unified atomic mass unit (u or amu).
1 Da = 1 u = 1.66055 x 10^-27 kg
Why Mass Spectrometry?
Highly selective: can monitor a given analyte with minimum
interference from other species in the sample.
The high selectivity also makes it possible to measure several species
at the same time, each with high resolution, as opposed to the averaged
signal obtained with many spectroscopic techniques such as UV absorption,
fluorescence, etc. Selectivity can be increased further by coupling MS
with a separation technique such as LC or GC.
Highly sensitive: MS has a very low limit of detection, from picomole
(10^-12 mol) down to zeptomole (10^-21 mol). It is much more sensitive
than NMR:
Concentration in NMR is typically mM
MS can handle nM or better
Use of Mass Spectrometry
• Identify structures of biomolecules such as proteins,
carbohydrates, nucleic acids and steroids
• Sequence biopolymers: proteins and oligosaccharides
• Determine how drugs are used by the body (metabolites)
• Perform forensic analyses: confirmation and quantitation
of drugs of abuse
• Analyze for environmental pollutants
• Determine the age and origins of specimens in geochemistry
and archaeology
• Identify and quantitate compounds of complex organic
mixtures
Applications in Biochemistry
•Characterization of biomolecules including proteins and their
modifications (phospho, glyco)
•Sequence determination of proteins (peptides), polysaccharides,
lipids
•Protein–Ligands Interactions
•Protein abundance
•Protein-Protein interactions (network)
•Stoichiometry of complexes (quaternary structure, metal, etc.)
•Cellular localization (organellar proteomics)
•Protein dynamics and folding
•Proteomics, Glycomics, Lipidomics, Metabolomics (HTP of above)
The Mass Spectrometer
Sample -> Ion Source -> Mass Analyzer -> Detector -> Data System
(the mass analyzer and detector operate under high vacuum)
Ion Source: creates ions in the gas phase
• Electron Impact (EI)
• Chemical Ionization (CI)
• Fast atom bombardment (FAB)
• MALDI
• Electrospray
Mass Analyzer: separates ions in space or time according to their
mass-to-charge ratio m/z
• Magnetic sector
• Time-of-flight (TOF)
• Quadrupole (Q)
• Hybrid: Q-TOF
• Linear Ion Trap
• FT ICR
• Orbitrap
Detector: collects ions and amplifies signals
Data System: stores and analyzes data; controls the mass spectrometer
Why high vacuum in mass analyzer?
High vacuum = low pressure
High vacuum is necessary to minimize collisions with other
gaseous molecules. Collisions would deflect the ions from their
trajectory, and ions would lose their charge on the walls of the instrument.
Ion-molecule collisions could also produce unwanted reactions and increase
the complexity of the spectrum (controlled collisions can be useful and will
be discussed later).
The average distance a particle can travel before colliding is
called the mean free path L:
$L = \dfrac{kT}{\sqrt{2}\, p\, \sigma}$
k = Boltzmann constant, T = temperature in K
p = pressure in Pa, σ = collision cross-section (m^2)
σ = π d^2, where d is the sum of the radii of the stationary molecule
and the colliding molecule: d = r1 + r2
k = 1.38 x 10^-23 J K^-1, T ~ 300 K, σ ~ 45 x 10^-20 m^2
L (cm) = 0.66 / p (Pa) or = 4.95 / p (milliTorr)
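As a rough numerical check of the mean free path expression, the sketch below evaluates L = kT / (sqrt(2) p σ) with the slide's typical values (T = 300 K, σ = 45 x 10^-20 m^2). The function name and the Torr conversion constant are illustrative choices, not part of the course notes.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
PA_PER_TORR = 133.3   # 1 Torr expressed in Pa

def mean_free_path_m(pressure_pa, temperature_k=300.0, cross_section_m2=45e-20):
    """Mean free path L = kT / (sqrt(2) * p * sigma), returned in metres."""
    return K_B * temperature_k / (math.sqrt(2) * pressure_pa * cross_section_m2)

# At 1 Pa the result, in cm, should be close to the 0.66 / p(Pa) rule of thumb.
l_1pa_cm = mean_free_path_m(1.0) * 100.0

# At 1e-7 Torr the mean free path is hundreds of metres.
l_high_vac_m = mean_free_path_m(1e-7 * PA_PER_TORR)

print(f"L at 1 Pa      : {l_1pa_cm:.2f} cm")
print(f"L at 1e-7 Torr : {l_high_vac_m:.0f} m")
```

The defaults are only order-of-magnitude values; the cross-section depends on the ion and on the background gas.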
Why high vacuum in mass analyzer?...
The mean free path L must be larger than the dimensions of the mass analyzer.
In a typical mass spectrometer the mean free path should be at least 1 meter,
and hence the pressure should be no more than 66 nbar.
In practice L should be 10 to 100 times the ion flight path, to reduce
the probability of an ion/neutral collision to 10% or, better, to 1% or less.
Typical vacuum in MS systems is 10^-5 to 10^-10 Torr. For an air molecule at
10^-7 Torr, L is ~1 km.
Note: Measurement of the cross section can yield information on the
conformation of molecules in the gas phase.
Useful Definitions and Units
Prefixes for SI units
10^-1     deci
10^-3     milli
10^-6     micro
10^-9     nano
10^-12    pico
10^-15    femto
10^-18    atto
10^-21    zepto
Quantities
Charge of the electron      e     1.60219 x 10^-19 C
Mass of the electron        me    9.10953 x 10^-31 kg
Mass of the proton          mp    1.67265 x 10^-27 kg
Mass of the neutron         mn    1.67495 x 10^-27 kg
Unified atomic mass unit    u     1.66055 x 10^-27 kg
Avogadro constant           NA    6.02205 x 10^23 mol^-1
Pressure
1 pascal (Pa) = 1 newton (N) m^-2
1 bar = 10^6 dyn cm^-2 = 10^5 Pa
1 millibar (mbar) = 10^-3 bar = 10^2 Pa
1 atmosphere (atm) = 1.013 bar = 101 325 Pa
1 Torr = 1 mmHg = 1.33 mbar = 133.3 Pa
1 psi = 1 pound per square inch = 0.07 atm
Energy
1 cal = 4.184 J
1 eV = 1.602 x 10^-19 J
Mass Spectrum
A 2-dimensional representation of signal intensity (y axis) vs m/z (x axis).
The intensity reflects the abundance of the ionic species.
The most intense peak is called the base peak
and is most often normalized to 100% relative
intensity. Peaks are plotted as centroids.
[Figure: schematic mass spectrum, relative intensity (0-100) vs m/z (50-200)]
m/z is dimensionless; z is an integer, 1 or more
Back to Basics…
Chemical Composition of Living Matter
27 of 92 natural elements are essential.
Elements in biomolecules (organic matter):
H, C, N, O, P, S
These elements represent approximately 92% of
dry weight.
Organic Matter
Organized in "building blocks"
amino acids
polypeptides (proteins)
monosaccharides
starch, glycogen
nucleic acids
DNA, RNA
Mass (Weights) of Atoms and Molecules
Element  Nominal mass  Exact mass  Percent abundance  Average mass
C        12            12.00000    98.90%             12.0115
         13            13.00335    1.10%
H        1             1.00783     99.985%            1.00797
         2             2.01410     0.015%
O        16            15.99491    99.762%            15.9994
         17            16.9991     0.038%
         18            17.9992     0.2%
N        14            14.00307    99.63%             14.0067
         15            15.00011    0.37%
S        32            31.9721     95.02%             32.066
         33            32.9714     0.75%
         34            33.9678     4.21%
         36            35.9671     0.02%
P        31            30.9737     100%               30.9737
Calculation of Atomic and Molecular mass
Nominal mass: the approximate mass of a molecule, calculated from the
integer mass of each element present, e.g. CO2: 12 + (2 x 16) = 44. Not precise
but sufficient in many applications.
Isotopic mass: calculated from the exact mass of the isotopes. It is close
but not equal to the nominal mass. The monoisotopic mass of a molecule
is the sum of the exact masses of the most abundant isotope of each atom
present. For CO2: 12.000000 u + (2 x 15.994915) u = 43.989830 (u or Da)
Exact ionic mass: depends on how the ions are formed. For the radical cation CO2+.:
12.000000 u + (2 x 15.994915) u – 0.000548 u (mass of e) = 43.989282 u.
For ESI or MALDI in positive ion mode, we add the mass of
one or more protons.
Relative atomic mass (average mass): calculated from the weighted
average of the naturally occurring isotopes of an element. The relative molecular
mass Mr is calculated from the relative atomic masses of the elements in the
empirical formula, e.g. CO2: 12.0108 + (2 x 15.9994) = 44.0096
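The three kinds of mass defined above can be computed from one small lookup table per definition. The sketch below is illustrative only: the table contents are limited to a few elements, and the helper name `formula_mass` is an assumption rather than part of any library.

```python
# Per-element masses (u); only the elements needed for small examples.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
AVERAGE = {"C": 12.0108, "H": 1.00797, "N": 14.0067, "O": 15.9994}
ELECTRON = 0.000548  # mass of the electron in u

def formula_mass(counts, table):
    """Sum the per-element masses in `table` weighted by the atom counts."""
    return sum(table[element] * n for element, n in counts.items())

co2 = {"C": 1, "O": 2}
nominal = formula_mass(co2, NOMINAL)
mono = formula_mass(co2, MONOISOTOPIC)
radical_cation = mono - ELECTRON      # CO2+. formed by loss of one electron
average = formula_mass(co2, AVERAGE)

print(nominal, round(mono, 6), round(radical_cation, 6), round(average, 4))
```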
Mass spectrum
A mass spectrum is a graph of ion intensity as a function of mass-to-charge
ratio. Mass spectra are often depicted as simple histograms, with the most
abundant ion set to 100% relative intensity (number of ions counted).
Example: choline ion, (CH3)3N+-CH2-CH2-OH, m/z 104
5 C (12) = 60
14 H (1) = 14
1 O (16) = 16
1 N (14) = 14
Total     = 104
[Figure: low resolution mass spectrum of the choline ion, relative intensity vs m/z (50-150), single peak at m/z 104]
Formation of Ions by Electron Ionization
Removal of 1 electron
Mass or Molecular Weight of Molecules
Ethyl acetate, C4H8O2 (4 C, 8 H, 2 O)
Nominal mass: 48 + 8 + 32 = 88
Monoisotopic mass:
4 x 12.0000 = 48.0000
8 x 1.00783 = 8.06264
2 x 15.99491 = 31.98982
Total = 88.0525
Average mass: 48.046 + 8.06376 + 31.9988 = 88.10856
Mass Spectrum of Ethyl Acetate by Electron Impact (EI)
Harsh ionization causes fragmentation.
[Figure: EI mass spectrum of ethyl acetate, % relative intensity vs m/z (20-80+)]
Monoisotopic peaks:
m/z 43.02 (100%): base peak, CH3CO+ fragment
m/z 88.05 (10%): molecular ion
m/z 15 (1%): CH3+ fragment
Isotope peaks: m/z 44.02 (2.2%) and m/z 89.05 (0.44%)
Approximation of Isotopic Distribution
Ethyl acetate, C4H8O2
1st peak (100% intensity): 4 12C, 8 1H, 2 16O
4 x 12.0000 = 48.0000
8 x 1.0078 = 8.0624
2 x 15.9949 = 31.9898
Total = 88.0522
Second peak (4.56% intensity), two contributions:
(a) one 13C: 3 12C, 1 13C, 8 1H, 2 16O (1.1% x 4 = 4.4%)
3 x 12.0000 = 36.0000
1 x 13.00335 = 13.0034
8 x 1.0078 = 8.0624
2 x 15.9949 = 31.9898
Total = 89.055
(b) one 2H: 4 12C, 7 1H, 1 2H, 2 16O (0.020% x 8 = 0.16%)
4 x 12.0000 = 48.0000
7 x 1.0078 = 7.0546
1 x 2.0140 = 2.0140
2 x 15.9949 = 31.9898
Total = 89.0584
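The approximation above (number of atoms times the heavy-isotope abundance, summed over elements) is easy to script. This is a minimal sketch using the abundances quoted on the slide for 13C and 2H and the 15N abundance from the isotope table; like the slide, it ignores 17O and other minor contributions, and the function name is hypothetical.

```python
# Abundance of the "+1" isotope, in percent per atom (values used in the notes).
HEAVY_PERCENT = {"C": 1.1, "H": 0.020, "N": 0.37}

def m_plus_1_percent(counts):
    """Approximate intensity of the M+1 peak relative to the first peak (100%)."""
    return sum(HEAVY_PERCENT.get(element, 0.0) * n for element, n in counts.items())

ethyl_acetate = {"C": 4, "H": 8, "O": 2}
m1 = m_plus_1_percent(ethyl_acetate)
print(f"M+1 for C4H8O2: {m1:.2f}% of the monoisotopic peak")
```

Because the carbon term grows linearly with the number of carbons, the same estimate exceeds 100% once a molecule contains roughly 90 or more carbon atoms.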
Amino Acids (20)
General structure: H2N-CH(R)-CO2H
Intact nominal mass:
Glycine (Gly, G)         R = H             C2H5NO2      MW 75
Alanine (Ala, A)         R = CH3           C3H7NO2      MW 89
Aspartic acid (Asp, D)   R = CH2CO2H       C4H7NO4      MW 133
Lysine (Lys, K)          R = (CH2)4-NH2    C6H14N2O2    MW 146
Exact Mass of Amino Acid Residues in Proteins
Gly  G   57.02150
Ala  A   71.03720
Gln  Q   128.05860
Lys  K   128.09500
Glu  E   129.04270
Note: Leu (L) = Ile (I) = 113.08410
Amino Acids and Proteins Have
Mass (or Weight)
[Figure: condensation of Ala, Ser and Phe into the tripeptide Ala-Ser-Phe with loss of 2 H2O]
Ala-Ser-Phe (ASF)
Nominal mass: MW 89 + 105 + 165 - (2 x 18) = 323
or C15H21N3O5
Monoisotopic mass:
71.03711 + 87.03203 + 147.0684 + 18.0105 (H2O) = 323.1481
Mr (average mass): 323.3490
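The peptide mass calculation above is simply the sum of the residue masses plus one water. A minimal sketch, with a residue table restricted to the three residues of the example (the function name is illustrative):

```python
# Monoisotopic residue masses (u) for the residues used in the example.
RESIDUE_MONO = {"A": 71.03711, "S": 87.03203, "F": 147.06841}
WATER_MONO = 18.01056  # H2O added for the free N- and C-termini

def peptide_monoisotopic_mass(sequence):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MONO[aa] for aa in sequence) + WATER_MONO

mass_asf = peptide_monoisotopic_mass("ASF")
print(f"Ala-Ser-Phe: {mass_asf:.4f}")
```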
Mass accuracy and resolution
Mass accuracy: the difference between the measured accurate mass
and the calculated exact mass. Mass accuracy can be stated in absolute units
of u (or mmu) or as relative mass accuracy in ppm (most common):
$\text{ppm} = \dfrac{(\text{Experimental} - \text{Calculated}) \times 10^{6}}{\text{Calculated}}$
Example:
$\dfrac{(3708.99 - 3708.9494) \times 10^{6}}{3708.9494} = \dfrac{0.0406 \times 10^{6}}{3708.9494} = 11\ \text{ppm}$
Resolution: good mass accuracy can only be obtained from sharp,
evenly shaped signals that are well separated from each other.
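The ppm calculation can be written as a one-line helper. The sketch below reproduces the worked example with measured 3708.99 and calculated 3708.9494; the function name is an illustrative choice.

```python
def ppm_error(measured, calculated):
    """Relative mass error in parts per million."""
    return (measured - calculated) / calculated * 1e6

err = ppm_error(3708.99, 3708.9494)
print(f"{err:.1f} ppm")
```

A signed result is kept on purpose: a systematic positive or negative error across many peaks usually points to a calibration offset rather than random scatter.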
Resolution and mass accuracy
Resolution (R) is a measure of the separation between two adjacent
peaks (masses). Δm is the smallest mass difference at which two
masses can be resolved. R = m/Δm
Resolving power (R) is also a performance characteristic
of MS instruments, that is, their ability to distinguish between two
ions that differ only slightly in their m/z ratio.
There are a number of ways to describe resolution (R):
• Peak width at 10% valley for two overlapping peaks (2 x 5%)
• Peak width at 5% maximum for a single peak
• Peak full width at half maximum (FWHM) (most common),
i.e. the width in Da at 50% of the intensity
Since resolution is related to peak width, resolution will also
affect mass accuracy. On most instruments higher resolution
means lower sensitivity.
Resolution and mass accuracy
[Figure: two definitions of resolution. Left: two overlapping peaks separated by a 10% valley. Right: a single peak with its full width at half maximum (FWHM).]
Consequences of resolution on mass accuracy
[Figure: pairs of peaks 1 u apart at m/z 50/51, 500/501 and 1000/1001; the maxima of the last pair are shifted by about 0.1 u.]
Signals at m/z 50, 500 and 1000 at R = 500. At m/z 1000 the peak maxima
are shifted towards each other due to superimposing of the peaks.
Importance of Resolution
Glucagon: Monoisotopic and Average Mass
As the mass increases, the monoisotopic peak becomes less and less evident.
First peak (monoisotopic): C153H225N42O50S, set to 100%
Second peak (one heavy isotope), relative to the first:
12C -> 13C       153 x 1.1%     ~ 170%
1H -> 2H (D)     225 x 0.02%    = 4.5%
14N -> 15N       42 x 0.37%     = 15.5%
Total                           ~ 190%
Monoisotopic mass: 3,482.61
Average mass: 3,484.75
Note: The peak of highest intensity is 1 Da higher than the monoisotopic
peak for each ~1500 Da (i.e. for a mass of ~3000 the highest peak is 2 Da
higher than the monoisotopic peak).
Resolution and mass accuracy…
Mass accuracy:
ppm = 10^6 / R = 10^6 Δm / M
Example: measure a mass at 1,000 +/- 0.5 Da
Mass accuracy = 10^6 (0.5) / 1,000 = 500 ppm
Resolution R = M/Δm = 1,000/0.5 = 2,000
• Higher resolution gives higher mass accuracy
• For a given resolution, the absolute mass accuracy (in Da) decreases with higher m/z
m/z      Δm (+/-)   Resolution   ppm (+/-)   Mass range
1,000    0.05       20,000       50          999.95-1000.05
2,000    0.05       40,000       25          1999.95-2000.05
10,000   0.5        20,000       50          9999.5-10,000.5
10,000   0.05       200,000      5           9999.95-10,000.05
10,000   0.005      2,000,000    0.5         9999.995-10,000.005
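The relations R = M/Δm and ppm = 10^6 Δm/M can be checked against the rows of the table with a few lines of code. The function names below are illustrative.

```python
def resolution(mz, delta_m):
    """R = M / delta_m for a peak at `mz` measured to +/- `delta_m`."""
    return mz / delta_m

def accuracy_ppm(mz, delta_m):
    """Relative accuracy in ppm: 1e6 * delta_m / M (equivalently 1e6 / R)."""
    return delta_m / mz * 1e6

# (m/z, delta_m) pairs taken from the table in the notes.
rows = [(1000, 0.05), (2000, 0.05), (10000, 0.5), (10000, 0.05), (10000, 0.005)]
for mz, dm in rows:
    print(f"m/z {mz:>6}  dm {dm:<6}  R {resolution(mz, dm):>11,.0f}  "
          f"{accuracy_ppm(mz, dm):g} ppm  range {mz - dm:g}-{mz + dm:g}")
```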
Characteristics of Mass Spectrometers
- Sensitivity: expressed as the lowest detection limit,
e.g. picomole (10^-12 mole), now subfemtomole (< 10^-15 mole)
- Mass range, e.g. m/z 50-4000
- Mass accuracy: expressed in u or ppm (best 1-5 ppm)
- Resolving power: ability to separate two peaks (masses).
For R = 20,000, two masses at 100.000 and 100.005 can be distinguished
- Dynamic range: ability to observe two peaks of very
different intensities, e.g. 1000:1 (10^3); best 10^4 (LTQ-FTMS)
- Others: cost, ease of operation, etc.
Characteristics of Some Mass Spectrometers
- Sensitivity for tryptic peptides
MALDI-TOF/R    10-100 x 10^-15 mole
Q-TOF2         50-200 x 10^-15 mole
- Resolution
MALDI-TOF/R    10,000 at mass 2,000
Q-TOF2         10,000-15,000
FTMS           100,000-3,000,000
- Mass accuracy
MALDI-TOF/R, external calibration    +/- 50 ppm
MALDI-TOF/R, internal calibration    +/- 20 ppm
Q-TOF2, external calibration         +/- 50 ppm
FTMS                                 +/- 1-10 ppm

                     MALDI-TOF      Q-TOF   Ion Trap   FTICR (9.4T)
Sensitivity          Highest        High    Medium     High
Mass accuracy        Narrow range   High    Poor       Highest
Sequencing (MS/MS)   Difficult      Yes     Yes        Yes
Throughput           High           Med     Med        Med
Ease of operation    Easiest        Med     Med        Hardest
Cost                 300K           650K    300K       1.0M
Newest: MALDI-TOF/TOF; MALDI-Qq-TOF; FTICR MS 12 Tesla;
MALDI-ion trap/quadrupole; ESI Quad/Trap/TOF; Orbitrap
Matrix Assisted Laser Desorption/Ionization (MALDI)
1. The matrix containing the analytes (e.g. proteins) absorbs
UV (or IR) energy from a pulsed laser (10 nanoseconds)
2. The matrix ionizes and dissociates; it undergoes a
phase change to a supercompressed gas; it then
transfers its charge to the analyte molecules
3. The matrix expands at supersonic velocity; additional
analyte ions are formed in the gas phase; the resulting
ions are entrained in the expanding plume
4. The analyte ions are accelerated by a voltage pulse
and analyzed in the mass spectrometer
Matrix Assisted Laser Desorption/Ionization
Sample is co-crystallized with the matrix (solid)
Formation of singly charged ions
Koichi Tanaka, Nobel Prize 2002
MALDI-TOF/R MS of Peptides from a Tryptic Digest
Mass “Fingerprint” of a Pure Protein
[Figure: MALDI-TOF/R spectrum, % relative intensity vs m/z (1000-3000), showing isotopically resolved peptide clusters (most intense near m/z 2790.22, 1324.60, 2466.18, 1759.93 and 1974.94). Peptides from trypsin self-digestion serve as internal calibrants.]
Protein Identification with MALDI-TOF/R
1. Cut spots from the 2D gel, destain, reduce disulfide bonds,
alkylate with iodoacetamide and digest each spot with trypsin
(medium to high intensity silver stained spots)
2. Extract the peptides and purify with a ZipTip containing reverse
phase material, or by capillary HPLC.
3. Mix with matrix and analyze by MALDI-TOF/R
4. Compare the observed masses with masses in databases
obtained from a virtual tryptic digest of all proteins
(mass fingerprinting)
5. Confidence for hits depends on coverage: minimum 5
masses (should get >30% coverage)
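The "virtual tryptic digest" in step 4 can be sketched in a few lines. The version below cleaves C-terminal to Lys (K) and Arg (R); the exception for a following proline is a commonly used rule that is assumed here, not taken from the notes, and the test sequence is a made-up concatenation of peptides that appear elsewhere in this lecture.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # C-terminal peptide without K/R
        peptides.append(sequence[start:])
    return peptides

# Hypothetical protein built from three peptides used as examples in the notes.
protein = "ATSFYKEGVNDNEEGFFSARDAEFR"
print(tryptic_digest(protein))
```

A database search engine would compute the mass of every peptide produced this way for every protein entry and compare the list with the observed masses; missed cleavages and modifications are ignored in this sketch.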
Proteomic Analysis with MALDI: Mass Fingerprinting
“Bottom-up Approach”: peptides to proteins
1. Pick spots on a gel, or start from protein(s) in solution
2. Digest with a site-specific protease
3. Extract peptides; mass analyze
4. Database search or sequence
[Figure: workflow diagram ending in a peptide mass spectrum, mass (m/z) 1000-2000]
Difficulties With Mass Fingerprints
Many tryptic peptides have similar masses
resulting in numerous false positives.
Mass accuracy is critical !!
Mass from 1000.30-1000.70
Typical Problems
1. No MS signals!!
Insufficient sample (poor digestion, poor extraction)
Contaminants that affect ionization: SDS, acrylamide,
salts, detergents, PEG
2. Protein contamination
Keratins, peptides from trypsin self-digestion,
bacterial proteins, etc..
3. Detect the most abundant proteins only
4. Masses affected by PTMs, adducts, etc, wrong
assignment
Electrospray Ionization MS (ESI-MS)
Formation of Charged Droplets and
Multiply Charged Ions
Formation of multiply charged ions
Mass Spectrum of a Multiply Charged Protein
Raw Data Spectrum for Myoglobin
(Denaturing conditions)
[Figure: ESI-TOF MS (ES+) raw spectrum of myoglobin, % intensity vs m/z (700-1900)]
Charge-state series A (label = number of charges, m/z):
A25 679.06, A24 707.30, A23 738.04, A22 771.51, A20 848.55 (base peak),
A18 942.73, A17 998.12, A16 1060.51, A15 1131.08, A14 1211.77,
A13 1305.02, A11 1542.01
Mass from series A: 16951.50 ± 0.02
The maximum number of charges
depends on the number of
basic residues (Lys, Arg, His)
Deconvoluted Spectrum for Myoglobin
[Figure: deconvoluted ESI-TOF spectrum of myoglobin, % intensity vs mass (14000-19500); main peak A at 16951.85, minor peaks at 17049.71 and 17567.41]
A: 16951.50 ± 0.02
Expected MW = 16951.49 Da
R = 8000
MS of Glu-fibrinopeptide
[Figure: ESI-TOF MS survey scan (ES+), % intensity vs m/z (400-1500)]
+2 ion (A2): m/z 785.85, with isotope peaks at 786.36, 786.86, 787.38, 787.88
+3 ion (A3): m/z 524.24, with an isotope peak at 524.58
A: 1569.70 ± 0.00
M = (785.85 x 2) - 2H = 1569.69
MS of Glu-fibrinopeptide: doubly charged ion
[Figure: expanded view of the +2 ion, % intensity vs m/z (785-789)]
Isotope peaks: 785.85 (monoisotopic), 786.36, 786.86, 787.38, 787.88
The isotope peaks are spaced by 0.5 Da, as expected for a doubly charged ion (1 Da / 2 charges).
A: 1569.70 ± 0.00
M = (785.85 x 2) - 2H = 1569.69
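The 0.5 Da spacing between isotope peaks gives the charge state directly, since neighbouring isotope peaks differ by about 1 Da in mass and therefore by 1/z in m/z. The sketch below applies this to the Glu-fibrinopeptide isotope cluster; the function names and the choice of 1.00728 for the proton mass are illustrative assumptions.

```python
PROTON = 1.00728  # mass of a proton in u

def charge_from_spacing(isotope_mz):
    """Estimate z from the mean spacing of consecutive isotope peaks (z ~ 1 / spacing)."""
    spacings = [b - a for a, b in zip(isotope_mz, isotope_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(1.0 / mean_spacing)

def neutral_mass(mz, z):
    """Neutral mass from an [M + zH]z+ ion: M = z * (m/z - proton)."""
    return z * (mz - PROTON)

peaks = [785.85, 786.36, 786.86, 787.38, 787.88]   # isotope cluster from the spectrum
z = charge_from_spacing(peaks)
m = neutral_mass(peaks[0], z)
print(f"z = {z}, M = {m:.2f}")
```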
Transform of Different Charged States to MW
Each peak is related to the mass (MW) and the charge state (n):
$(m/z)_1 = \dfrac{MW + n_1 m_H}{n_1}$
$(m/z)_2 = \dfrac{MW + n_2 m_H}{n_2}$
Adjacent peaks of a pure molecule are related: n2 = n1 + 1, that is, one
more proton and one more charge.
Calculation of MW (M) of Proteins from ESI Data
For two adjacent peaks, (m/z)1 (higher value) and (m/z)2 (lower value),
the number of charges (n) will differ by one; mH = mass of H+ (1.0079)
$n_2 = n_1 + 1$
$(m/z)_1 = \dfrac{M + n_1 m_H}{n_1}$
$(m/z)_2 = \dfrac{M + n_2 m_H}{n_2} = \dfrac{M + (n_1 + 1)\, m_H}{n_1 + 1}$
$M = n_1 (m/z)_1 - n_1 m_H = (n_1 + 1)(m/z)_2 - (n_1 + 1)\, m_H$
$n_1 = \dfrac{(m/z)_2 - m_H}{(m/z)_1 - (m/z)_2}$
$M = n_1 \left( (m/z)_1 - m_H \right)$
Once we know the charge state n1
we can calculate M.
ESI MS of Bovine Insulin
Bovine insulin gives a peak at 2867.2 and adjacent
peaks at 1912.3, 1434.3 and 1147.4
n1 = (m/z2 - mH) / (m/z1 - m/z2)
n1 = (1912.3 – 1) / (2867.2 – 1912.3) = 1911.3 / 954.9 = 2.00
M = n1 (m/z1 - mH)
M = 2.00 x (2867.2 - 1.0079) = 5732.4
Repeat for the next pairs: 1912.3 / 1434.3 gives M = 5733.9
and 1434.3 / 1147.4 gives M = 5733.2
and average:
Mexp = 5732.9 +/- 0.7
Mr = calculated average mass = 5733.58
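The two formulas used for insulin (charge of the higher-m/z peak from an adjacent pair, then M from that charge) can be scripted as follows. This is a sketch with illustrative function names; it uses mH = 1.0079 as in the notes and rounds the charge to the nearest integer before computing M.

```python
M_H = 1.0079  # mass of H+ as used in the notes

def charge_of_higher_peak(mz1, mz2, m_h=M_H):
    """n1 for the higher-m/z peak of an adjacent pair (mz1 > mz2)."""
    return (mz2 - m_h) / (mz1 - mz2)

def mass_from_pair(mz1, mz2, m_h=M_H):
    """Return (n1, M) with n1 rounded to an integer and M = n1 * (mz1 - mH)."""
    n1 = round(charge_of_higher_peak(mz1, mz2, m_h))
    return n1, n1 * (mz1 - m_h)

peaks = [2867.2, 1912.3, 1434.3, 1147.4]           # bovine insulin, from the notes
estimates = [mass_from_pair(a, b) for a, b in zip(peaks, peaks[1:])]
for n1, mass in estimates:
    print(f"n = {n1}, M = {mass:.1f}")
```

Commercial deconvolution software fits all charge states at once; the pairwise calculation here is only meant to mirror the hand calculation.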
Verification of Mutant Proteins
Yeast Iso-1 cytochrome c
Y67F, N53I (+heme)
Calcd: 12688
Obsd: 12687
Confirmation of Sequence: Glyoxylase 1
Predicted from cDNA: 14906
Observed: 14919 +/- 1
Difference: 13 Da?!
ACC to AAC (Thr to Asn, +13 Da)
Effects of Denaturation on Charge Distribution
Denaturation by heat, pH,
organic solvents
2 distributions of charge states
Non-Covalent Complex: Calmodulin + 4 Ca++ + MLCK
pH = 6.7, 40 C, 5% MeOH
MLCK peptide KRRWKKNFIAVSAANRFKKISS: 2634
Calmodulin: 16700
4 Ca++: 160
Calcd: 19486
Obsd: 19484
Effect of pH on Hemoglobin Tetramer
D = Dimer
Q = Tetramer
ESMS and Tandem MS (MS/MS) of Peptides
or
Multiply Charged
MS/MS spectrum
Protein Identification by MS/MS of Peptides
[Figure: fragment ion nomenclature for a five-residue peptide H2N-CH(R1)-CO-NH-CH(R2)-CO-...-CH(R5)-CO-OH. Cleavage along the backbone gives N-terminal fragments a, b, c (a1-a5, b1-b4, c1-c4) and C-terminal fragments x, y, z (x1-x4, y1-y5, z1-z4), each numbered from its own terminus.]
“Mobile” proton causes cleavage along peptide chain
[Figure: structures of the protonated peptide ATSFYK showing the ionizing proton located at different backbone amide positions]
Migration of the mobile proton
[Figure: the proton migrates from one backbone site to another along the chain]
Doubly Protonated ATSFYK
[Figure: one proton is retained on the C-terminal lysine side chain, (CH2)4-NH2 H+, while the second proton is mobile along the backbone]
Each protonation site can induce different fragmentation
Formation of y ions:
[Figure: mechanism in which the mobile proton on a backbone amide of doubly protonated ATSFYK induces cleavage of the peptide bond; the C-terminal fragment retains the lysine proton and the new N-terminal proton, giving the y3 ion (doubly charged)]
Formation of b ions:
[Figure: mechanism for a five-residue peptide (R1-R5). Protonation at the amide nitrogen between residues 2 and 3 leads to cleavage of that peptide bond, giving the b2 ion (acylium form, H2N-CH(R1)-CO-NH-CH(R2)-C≡O+) and the neutral C-terminal fragment (corresponding to y3). Loss of CO (-28) from b2 gives the a2 ion.]
b2 ions are often observed with a diagnostic -28 a2 ion; b1 ions are rare
b2 ion allows you to determine yn-2 ion, since M + 2 = b2 + yn-2
y and b Ions from Peptide DAEFR
Residue masses: H (1) - Asp 115.1 - Ala 71.1 - Glu 129.1 - Phe 147.2 - Arg 156.2 - OH (17), + H+
y ions:
y5   115.1 + 1 + 71.1 + 129.1 + 147.2 + 156.2 + 17 + 1    m/z = 637.7
y4   71.1 + 1 + 129.1 + 147.2 + 156.2 + 17 + 1            m/z = 522.6
y3   129.1 + 1 + 147.2 + 156.2 + 17 + 1                   m/z = 451.5
y2   147.2 + 1 + 156.2 + 17 + 1                           m/z = 322.4
y1   156.2 + 1 + 17 + 1                                   m/z = 175.2
b ions:
b5   1 + 115.1 + 71.1 + 129.1 + 147.2 + 156.2             m/z = 619.7
b4   1 + 115.1 + 71.1 + 129.1 + 147.2                     m/z = 463.5
b3   1 + 115.1 + 71.1 + 129.1                             m/z = 316.3
b2   1 + 115.1 + 71.1                                     m/z = 187.2
b1   1 + 115.1                                            m/z = 116.1
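The b and y series for DAEFR follow a simple running sum, which the sketch below reproduces with the rounded residue masses and the nominal values H = 1 and OH = 17 used on the slide. Function names are illustrative.

```python
# Rounded residue masses used on the slide.
RESIDUE_AVG = {"D": 115.1, "A": 71.1, "E": 129.1, "F": 147.2, "R": 156.2}
H, OH = 1.0, 17.0  # nominal masses as on the slide

def b_ions(sequence):
    """Singly charged b ions: N-terminal residues plus 1 (b1 first)."""
    ions, total = [], H
    for aa in sequence:
        total += RESIDUE_AVG[aa]
        ions.append(round(total, 1))
    return ions

def y_ions(sequence):
    """Singly charged y ions: C-terminal residues plus OH + H + H+ (y1 first)."""
    ions, total = [], OH + H + H
    for aa in reversed(sequence):
        total += RESIDUE_AVG[aa]
        ions.append(round(total, 1))
    return ions

print("b:", b_ions("DAEFR"))
print("y:", y_ions("DAEFR"))
```

Differences between consecutive entries of either list are the residue masses, which is what makes the ladder readable as a sequence.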
Solving MS/MS Spectra
The mass difference between b1 and b2, b2 and b3, or between
y1 and y2, etc., gives the mass of an amino acid residue.
For tryptic digests (cleavage at K, R), the first amino acid at the C-terminus
is known, i.e. K or R.
(Note: R gives stronger signals than K by either ESI or MALDI.)
Immonium ions (H2N+=CH-R) are often observed and give information on the types of
amino acids present in the sequence (not observed with ion trap MS).
Immonium ion m/z:
Gly 30     Ala 44     Ser 60     Pro 70
Val 72     Thr 74     Cys 76     Leu 86
Ile 86     Asn 87     Asp 88     Lys 101
Gln 101    Glu 102    Met 104    His 110
Phe 120    Arg 129    Tyr 136    Trp 159
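Idealized Product Ion Spectrum of Tryptic Peptides
M + 2 = b2 + yn-2
[Figure: idealized product ion spectrum, % intensity vs m/z (about 28 to 500). Immonium ions appear at low m/z, followed with increasing m/z by a2, y1, b2, b3, y2, b4, y3, y4, b5 and y5, where y5 = M + 1.]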
MS of Glu-Fibrinopeptide
Select doubly charged ion in MS
[Figure: ESI-TOF MS survey scan (m/z 400-1500) with an expanded view (m/z 785-789) of the +2 ion at m/z 785.85, which is selected for MS/MS; the +3 ion appears at m/z 524.24. A: 1569.70 ± 0.00]
Sequence: EGVNDNEEGFFSAR
Sequencing Glu-fibrinopeptide (Q-TOF)
[Figure: MS/MS spectrum of Glu-fibrinopeptide, % intensity vs m/z (100-1500). The y1 ion at m/z 175.12 assigns the C-terminal R; the remaining mass is shown as (1395.47). The precursor (M+H)+ is at 1570.72. Other prominent peaks: 187.08, 246.16, 333.20, 480.27, 684.36 (base peak), 813.40, 942.44, 1056.47, 1171.50, 1285.55.]
•m/z 175 = C-terminal Arg, m/z 147 = C-terminal Lys (y ion series)
•Can start sequencing from anywhere
MS/MS of Glu Fibrinopeptide
246 – 175 = 71, residue molecular weight Ala = 71!
[Figure: same MS/MS spectrum with y1 (175.12, R) and y2 (246.16, A) assigned; remaining mass shown as (1324.43). (M+H)+ = 1570.72.]
MS/MS of Glu Fibrinopeptide
333 – 246 = 87, residue molecular weight Ser = 87!
[Figure: same MS/MS spectrum with y1 (175.12, R), y2 (246.16, A) and y3 (333.20, S) assigned; remaining mass shown as (1237.40). (M+H)+ = 1570.72.]
MS/MS of Glu Fib Complete Sequence
[Figure: fully annotated MS/MS spectrum of Glu-fibrinopeptide (EGVNDNEEGFFSAR), % intensity vs m/z (100-1500)]
Assigned y ions (m/z):
y1 175.12, y2 246.16, y3 333.20, y4 480.27, y5 627.33, y6 684.36,
y7 813.40, y8 942.44, y9 1056.47, y10 1171.50, y11 1285.55, y12 1384.63
Assigned b ion: b2 187.08
Precursor: (M+H)+ 1570.72
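Reading a sequence from the y-ion ladder amounts to matching each consecutive mass difference to a residue mass. The sketch below does this for the y1 to y12 values labelled in the Glu-fibrinopeptide spectrum. The residue table is deliberately limited to the residues that occur in this peptide, and the 0.03 Da tolerance and the function names are assumptions made for illustration.

```python
# Monoisotopic residue masses (u) for the residues present in this peptide only.
RESIDUE_MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "N": 114.04293, "D": 115.02694, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}
Y1_OFFSET = 19.01784  # H2O (18.01056) + proton (1.00728) carried by every y ion

def residue_for(delta, tol=0.03):
    """Closest residue to a mass difference, or '?' if nothing is within tol."""
    best = min(RESIDUE_MONO, key=lambda aa: abs(RESIDUE_MONO[aa] - delta))
    return best if abs(RESIDUE_MONO[best] - delta) <= tol else "?"

def read_y_ladder(y_mz):
    """Translate y1..yn m/z values into a sequence written N- to C-terminus."""
    residues = [residue_for(y_mz[0] - Y1_OFFSET)]        # y1 gives the C-terminal residue
    for lower, higher in zip(y_mz, y_mz[1:]):
        residues.append(residue_for(higher - lower))      # each step adds one residue
    return "".join(reversed(residues))

y_series = [175.12, 246.16, 333.20, 480.27, 627.33, 684.36,
            813.40, 942.44, 1056.47, 1171.50, 1285.55, 1384.63]
tag = read_y_ladder(y_series)
print(tag)
```

The ladder stops at y12, so the two N-terminal residues (E, G) are not resolved individually; only their combined mass is available from the difference to the precursor.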
Sequence Tags from Asp-N Treatment
(cleavage before D)
[Figure: MS/MS spectrum of an Asp-N peptide, % intensity vs m/z (100-1100), annotated with the sequence tags DL, DLVKP and GLL]
Assigned ions (m/z): a2 201.00, b2 229.03, b5 553.37, a6 638.44, b6 666.46, b7 781.46, b8 894.54
•Observe mainly b ions; b1 ions rarely observed
MS/MS of EA(I/L)DFFAR
Precursor: 484.74 (2+)
Expected MH+ = 484.74 x 2 - 1.0078 = 968.47
[Figure: deconvoluted MS/MS spectrum, % intensity vs mass (100-1000)]
y-ion ladder (m/z) and assigned residues:
175.12 (R), 246.15 (A), 393.22 (F), 540.28 (F), 655.30 (D), 768.40 (I/L), 839.44 (A)
968.47 - 839.44 = 129.03 (i.e. glutamic acid)
Full sequence: EA(I/L)DFFAR
Expected Chemical Modifications
1. Carbamidomethyl (CAM) (-CH2-C(O)-NH2)
On cysteine after alkylation with iodoacetamide, I-CH2C(O)NH2
103 (Cys) + 58 (CAM) = 161 - 1 (H of SH) = 160
2. Oxidation of methionine residues (air oxidation)
-CH2-CH2-S-CH3 (Met) -> -CH2-CH2-S(O)-CH3 (Met(O)): +16
3. Deamidation
-CH2C(O)NH2 (Asn, Gln) -> -CH2C(O)OH (Asp, Glu): +1
4. Carbamylation (Lys and N-terminal NH2)
-CH2-NH2 -> -CH2-NH-C(O)-NH2: +43
Manual or De novo Interpretation of MS/MS Data
1. Most proteins are analyzed by MS/MS after trypsin
digestion (unless otherwise specified, e.g. Lys-C or Asp-N).
2. A parent ion (usually doubly charged) is selected
by the first quadrupole. A neutral gas is introduced in the
collision cell and causes fragmentation along the backbone,
producing b and y ions.
Each b ion differs from the next by the mass of one amino acid residue.
Each y ion differs from the next by the mass of one amino acid residue.
(See the MS/MS tutorial movie at
http://www.mshri.on.ca/pawson/ms/movie.html)
3. Since trypsin is used, the C-terminal amino acid must be Arg or Lys,
giving a y1 ion at m/z 175 (Arg) or 147 (Lys).
Interpretation of MS/MS Data …
4. The mass of the peptide can be calculated from the doubly
charged ion: M = (m/z x 2) - 2H
5. The b and y series may not be complete, creating gaps in the
sequence. The gaps can often be identified, or partially
identified, from the sum of the masses of two amino acids.
b and especially y ions can lose H2O, so the mass of the
amino acid - 18 is detected in the MS.
6. b2 ions are often observed with a diagnostic -28 a2 ion; b1 ions are rare.
The b2 ion allows us to determine the yn-2 ion, since M + 2 = b2 + yn-2,
where n is the maximum number of possible y ions.
Smallest b2 = 115 (Gly, Gly); largest b2 = 373 (Trp, Trp)
7. There are many mass equivalences. The two most common are oxidized
methionine, Met(O) = 147.04, vs Phe = 147.07, and
Cys(CAM) vs Cys + Gly (CG) = 160.03
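The relation M + 2 = b2 + yn-2 gives the complementary y ion for any observed b ion, and the b2 bounds quoted above give a quick plausibility filter. A minimal sketch with nominal proton masses, as in the notes; the function names are hypothetical, and the example values are the DAEFR numbers from earlier in the lecture (y5 = M + 1 = 637.7, b2 = 187.2).

```python
def complementary_y(neutral_mass, b_mz):
    """Singly charged y ion complementary to a singly charged b ion: y = M + 2 - b."""
    return neutral_mass + 2.0 - b_mz

def is_plausible_b2(mz, smallest=115.0, largest=373.0):
    """True if mz lies between b2 of Gly-Gly and b2 of Trp-Trp."""
    return smallest <= mz <= largest

m_daefr = 637.7 - 1.0             # neutral mass M from y5 = M + 1
y3 = complementary_y(m_daefr, 187.2)
print(f"y3 = {y3:.1f}, b2 plausible: {is_plausible_b2(187.2)}")
```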
Bioinformatics
Databases, several types:
-DNA sequences, protein sequences
-EST (expressed sequence tags) (more prone to errors)
-2D Gels, 3-D structure, post-translational modifications
-Annotations: forms, function, etc.
Protein Databases: (use more than one to increase confidence)
SwissProt (best)
NCBInr
OWL
Search Engines
Mascot: masses, sequence tags, MS/MS data
Profound: masses, sequence tags, MS/MS data
MS-Fit: masses, sequence tags, MS/MS data,
homology
Protein Links Global Server (PLGS, Micromass)
Strategies to ID proteins with MS/MS data
Need to determine sequence of tryptic peptides
- for de novo sequencing of unknown organisms
- to obtain partial sequences for database searches
“Sequence Tags”: get much better results
Algorithms to determine sequence are poor and
determination can be slow when done manually.
Solution:
search databases with uninterpreted MS/MS
data against virtual (in silico) MS/MS of peptide
in database (MASCOT from www.matrixscience.com)
or SEQUEST or X!Tandem
Web Tools
• Peptide Mass Fingerprinting
• Mascot (Peptide Mass Fingerprint):
http://www.matrixscience.com
• MassSearch:
http://cbrg.inf.ethz.ch/Server/MassSearch.html
• MOWSE:
http://www.hgmp.mrc.ac.uk/Bioinformatics/Webapp/mowse
• MS-Fit: http://prospector.ucsf.edu
• PeptIdent: http://us.expasy.org/tools/peptident.html
• Peptide Search: http://www.mann.emblheidelberg.de/GroupPages/Homepage.html
• Profound: http://prowl.rockefeller.edu
• PepMapper: http://wolf.bms.umist.ac.uk/mapper/
Approaches to Identify Proteins from MS Data
1. Masses of digested peptides compared with in silico digests
of protein databases (mass fingerprinting)
Unreliable
2. Compare uninterpreted MS/MS data with in silico MS/MS of digested
proteins in databases (eg MASCOT)
Problems:
- Too many false hits
- Need known genomes
3. Search databases with partial sequences (sequence tags)
Much better for known and unknown genomes
Problems:
- Long and tedious to determine sequence manually
- Inaccurate software available until PEAKS