Barbara Falkowski Falgun Patel Celia Smith Molecular Weight Determination of Unknown Proteins

Molecular Weight Determination of Unknown Proteins
for
NASA/JPL PAIR Program
August 24, 2001
Barbara Falkowski
Falgun Patel
Celia Smith
The Overall Goal
• To determine the molecular weights of unknown proteins from gel electrophoresis data
Method to Achieve the Goal
• Measure band distances for the unknowns and standards with Photoshop and SpotViewer.
• Decide whether SpotViewer or Photoshop is the better measuring tool.
• Run models on the standard proteins.
• Decide which model(s) work best for the standards.
• Run the model(s) on the unknown proteins.
• Decide which model(s) worked best on the unknowns.
SpotViewer Disadvantages
• Did not measure the dye-front distance.
  - One needed to go into Photoshop to mark or crop the dye-front distance.
• SpotViewer missed bands.
  - Did not always pick up bands that were thin, blurry, or close together.
• Sometimes gave two measurement values for one band,
  - or gave values that were not associated with any band.
• Did not pick up very light bands.
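The dye-front distance matters because relative mobility (RM), the quantity used by the models later in the talk, is a band's migration distance divided by the dye-front distance. A minimal Python sketch, assuming both distances are measured from the same origin and in the same units; the function name and the pixel values are hypothetical and not taken from the slides:

```python
def relative_mobility(band_distance, dye_front_distance):
    """Return RM = band distance / dye-front distance (same origin, same units)."""
    if dye_front_distance <= 0:
        raise ValueError("dye-front distance must be positive")
    if not 0 <= band_distance <= dye_front_distance:
        raise ValueError("band must lie between the origin and the dye front")
    return band_distance / dye_front_distance

# Hypothetical pixel measurements read off a gel image.
dye_front_px = 580.0
band_px = [52.0, 110.0, 186.0]
rms = [relative_mobility(d, dye_front_px) for d in band_px]
print(rms)
```

Without a dye-front measurement the denominator is missing, which is why a second tool was needed whenever SpotViewer was used.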
Photoshop Advantages
• Did not need assistance from another program.
  - Not as time consuming.
• Light bands could be more easily discerned through color inversion/manipulation of the image.
  - This also worked well with tightly packed, thin, and blurred bands.
Models Tested
• Quadratic Regression
• Quadratic Cross Validation
• SLIC
• Log-Linear Model
• Log-Log Model
• Local Linear Model
• Quadratic Interpolation
Gels/Protein Used
• Vitelline Envelopes (VE) for two species (Strongylocentrotus purpuratus and Lytechinus pictus)
• Vitelline Envelopes for two methods (DTT and mechanically isolated)
Which model worked the best?
• No single model was best for all of the gels.
• Different models worked better for different gels:
  - Quadratic Regression Model: 15% Gel #1, S. purp/L. pictus VE DTT Removal
  - SLIC Model: Gradient Gel #2, Jelly + Seminal Plasma + VE Time Courses
  - LOG-LOG Model: 12.5% Gels, Gel #4 VE + Tris Supernatant Time Course and Gel #6 VE + Tris Pellet Time Course
Why was the Quadratic Model chosen for Gel #1?
Gel #1, 15% Gel, S. purp/L. pictus VE DTT Removal, Lane 2

           SLIC                           Quadratic Regression
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     0.01132444  0.00012824         0.03853645  0.00148506
Band 2     0.00300890  0.00000905         0.00309766  0.00000960
Band 3     0.00109427  0.00000120         0.02988180  0.00089292
Band 4     0.00827156  0.00006842         0.05186572  0.00269005
Band 5     0.00688821  0.00004745         0.04430756  0.00196316
Band 6     0.00783585  0.00006140         0.02173100  0.00047224
Band 7     0.00975655  0.00009519         0.01972984  0.00038927
Sums       0.00000000  0.00041095         0.00000000  0.00790229
R-Squared  0.92768402                     0.98858508

           LOG-LOG                        Quadratic Interpolation
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     0.00026347  0.00000007         0.10070348  0.01014119
Band 2     0.01672254  0.00027964         0.01238500  0.00015339
Band 3     0.00637001  0.00004058         0.00503027  0.00002530
Band 4     0.03252705  0.00105801         0.01377812  0.00018984
Band 5     0.03261521  0.00106375         0.08413179  0.00707816
Band 6     0.00057092  0.00000033         0.03163810  0.00100097
Band 7     0.01074814  0.00011552         0.12022332  0.01445365
Sums       0.00000000  0.00255790         0.09314636  0.00867624
R-Squared  0.91630500

           LOG Linear                     Local Linear
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     0.06289249  0.00395547         0.02458821  0.00060458
Band 2     0.10387439  0.01078989         0.02458821  0.00060458
Band 3     0.07983165  0.00637309         0.00236728  0.00000560
Band 4     0.03433984  0.00117922         0.01533503  0.00023516
Band 5     0.04476902  0.00200427         0.03416853  0.00116749
Band 6     0.02631810  0.00069264         0.02429362  0.00059018
Band 7     0.08820824  0.00778069         0.03316659  0.00110002
Sums       0.21118107  0.03277527         0.15850747  0.02512462
R-Squared  0.955670994                    0.97855752
• Took a Quadratic Regression of the standards to find the intercept and coefficients.
• Used the intercept and coefficients in the equation:
  Log(MW) = a*RM^2 + b*RM + c
Sea Urchins
Gel #1, 15% Gel, S. purp/L. pictus VE DTT Removal, Lane 2

         Average  Square RM  Molecular  Log Molecular  Predicted Log                Residuals
         RM       Average    Weight     Weight         Molecular Weight  Residuals  Squared
Band 1   0.09     0.008      200000     5.30           5.26              0.04       0.001
Band 2   0.19     0.038      116500     5.07           5.07              0.00       0.000
Band 3   0.22     0.050       97000     4.99           5.02              0.03       0.001
Band 4   0.32     0.100       66000     4.82           4.87              0.05       0.003
Band 5   0.52     0.270       45000     4.65           4.61              0.04       0.002
Band 6   0.67     0.449       31000     4.49           4.47              0.02       0.000
Band 7   0.88     0.772       21500     4.33           4.35              0.02       0.000
Sums                                                                     0.00       0.008

Coefficients
Intercept            5.445671
Average RM          -2.13847
Square RM Average    1.017416

Regression Statistics
Multiple R           0.994276
R Square             0.988585
Adjusted R Square    0.982878
Standard Error       0.044447

A second, only partly legible coefficient block also appears on the slide: Intercept 5.072481, PS RM, Square RM Average -1.10049.
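The regression on the standards can be sketched in Python with NumPy. This is an illustrative reconstruction, not the authors' spreadsheet: it assumes "Log" means base 10 (consistent with 200000 mapping to 5.30 in the table) and it uses the rounded RM values shown on the slide, so the fitted coefficients will differ slightly from the reported 1.017416, -2.13847, and 5.445671.

```python
import numpy as np

# Standards from the Gel #1 table (RM values are the rounded ones on the slide).
rm = np.array([0.09, 0.19, 0.22, 0.32, 0.52, 0.67, 0.88])
mw = np.array([200000, 116500, 97000, 66000, 45000, 31000, 21500], dtype=float)
log_mw = np.log10(mw)

# Least-squares fit of Log(MW) = a*RM^2 + b*RM + c; polyfit returns [a, b, c].
a, b, c = np.polyfit(rm, log_mw, deg=2)

predicted = a * rm**2 + b * rm + c
residuals = log_mw - predicted
sse = float(np.sum(residuals**2))
r_squared = 1.0 - sse / float(np.sum((log_mw - log_mw.mean())**2))

print(f"a={a:.4f} b={b:.4f} c={c:.4f} SSE={sse:.5f} R^2={r_squared:.4f}")
```

The same three outputs (coefficients, squared residual sum, R-squared) are what the table above reports for the standards.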
• Put the relative mobilities of the unknowns into the equation to obtain the following results:
Log Molecular Weight Results for 15% Gel
Rows: Band 1 through Band 20
Columns (predicted Log MW, PREDLOGMW): Lane 1, Lane 3, Lane 4, Lane 5, Lane 6, Lane 7, Lane 8, Lane 9, grouped under S. Purp and L. Pict
Values as extracted (row and column positions are not recoverable):
5.287194
5.366035
5.337433
5.337433
5.370153
5.369856
5.325993
5.345953
4.959881
5.325296
5.250127
5.313233
5.211740
5.226168
5.243535
5.067651
4.751998
5.234675
5.122917
4.539243
5.031963
5.188135
5.061425
4.663953
4.624020
5.185351
5.038765
4.965727
5.047612
4.703594
4.562605
4.513194
5.105001
4.754128
4.930680
4.990248
4.639752
4.477437
4.427591
5.066297
4.640942
4.863528
4.944781
4.533252
4.439041
4.399273
4.927542
4.526304
4.837187
4.855694
4.467592
4.366838
4.362973
4.872454
4.362123
4.791981
4.795473
4.427966
4.343091
4.808690
4.764781
4.741470
4.775564
4.728064
4.715694
4.735798
4.702809
4.690736
4.712814
4.673574
4.659512
4.652414
4.645507
4.631915
4.605594
4.603454
4.614171
4.568308
4.568308
4.599074
4.499914
4.512002
4.520488
4.364580
4.451155
4.456455
4.356674
4.422551
4.423228
4.337822
4.343295
4.329330
4.326006
4.333159
4.334527
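The prediction step for the unknowns can be sketched as follows, plugging the coefficients reported for the Gel #1 standards into Log(MW) = a*RM^2 + b*RM + c. The function names and the example mobilities are hypothetical; base-10 logs are assumed, as in the standards table.

```python
# Coefficients reported on the slide for the Gel #1 quadratic regression.
A = 1.017416    # coefficient of RM^2
B = -2.13847    # coefficient of RM
C = 5.445671    # intercept

def predict_log_mw(rm):
    """Predicted Log10(MW) for a relative mobility rm."""
    return A * rm**2 + B * rm + C

def predict_mw(rm):
    """Predicted molecular weight, converting back from Log10."""
    return 10 ** predict_log_mw(rm)

# Hypothetical unknown-band mobilities, for illustration only.
for rm in (0.15, 0.40, 0.75):
    print(f"RM={rm:.2f}  Log(MW)={predict_log_mw(rm):.3f}  MW={predict_mw(rm):,.0f}")
```

As a check against the standards table, an RM of 0.09 gives a predicted Log(MW) that rounds to 5.26, the value listed for Band 1.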
What type of Cross Validation was done?
• Quadratic Cross Validation using relative mobility and Log Molecular Weight.
• Cross Validation was not chosen for any gel.
• The predicted value for the missing band was not close to the actual value in any of the gel cases.
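The slides do not spell out the procedure, but "the missing band" suggests a leave-one-out scheme: drop one standard, refit the quadratic, and predict the dropped band. A minimal sketch under that assumption, using the Gel #1 standards listed elsewhere in the talk (rounded RM values); the function name is hypothetical:

```python
import numpy as np

def quadratic_loo_cv(rm, log_mw):
    """Leave-one-out cross-validation for Log(MW) = a*RM^2 + b*RM + c.

    Returns the squared prediction error for each held-out band.
    """
    rm = np.asarray(rm, dtype=float)
    log_mw = np.asarray(log_mw, dtype=float)
    errors = []
    for i in range(len(rm)):
        keep = np.arange(len(rm)) != i               # drop band i
        coeffs = np.polyfit(rm[keep], log_mw[keep], deg=2)
        pred = np.polyval(coeffs, rm[i])             # predict the held-out band
        errors.append((log_mw[i] - pred) ** 2)
    return np.array(errors)

rm = [0.09, 0.19, 0.22, 0.32, 0.52, 0.67, 0.88]
log_mw = np.log10([200000, 116500, 97000, 66000, 45000, 31000, 21500])
sq_errors = quadratic_loo_cv(rm, log_mw)
print(sq_errors, sq_errors.sum())
```

Held-out bands at either end of the gel force the quadratic to extrapolate, which is one plausible reason such predictions can land far from the actual values.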
Results for Cross Validation Model on Standards
Gels: Gel #1 (15% Gel, S. purp/L. pictus VE DTT Removal), Gel #2 (Seminal and Jelly and VE Time Courses), Gel #4 (VE Tris Supernatant Time Course), Gel #6 (VE Tris Pellet Time Course)
Column order of the four gels is not recoverable; each row lists the values present for that band.

Band  Square Residuals (one value per gel)      Sums
1     0.050049  0.057916  0.043083  0.051572    0.20262
2     0.10454   0.063949  0.091593  0.114309    0.374392
3     0.139053  0.15307   0.155971  0.172042    0.620136
4     0.274772  0.315013  0.337997  0.25786     1.185643
5     0.647048  0.597523  0.601238  0.534674    2.380482
6     0.860863  0.93935   0.91457   0.923428    3.638211
7     1.219472  1.159049  1.41531               3.793831
8     1.774662                                  1.774662
9     1.370537                                  1.370537
Why was the SLIC Model chosen for the Gradient Gel #2?
• Residual Sum = 0.00
• Residual Squared Sum = 0.00
• Largest R^2 = 0.99
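The three selection criteria can be computed from observed and predicted values as sketched below. The function name and the numbers are made up for illustration. For a least-squares fit that includes an intercept, the signed residuals sum to zero by construction, so it is mainly the squared sum and R^2 that separate one model from another.

```python
import numpy as np

def fit_statistics(observed, predicted):
    """Return (sum of residuals, sum of squared residuals, R^2)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    sse = float(np.sum(residuals**2))
    sst = float(np.sum((observed - observed.mean())**2))
    return float(np.sum(residuals)), sse, 1.0 - sse / sst

# Made-up numbers, only to show the call.
obs = [1.09, 1.07, 1.06, 1.05, 1.03]
pred = [1.08, 1.07, 1.06, 1.05, 1.04]
print(fit_statistics(obs, pred))
```

One caveat when comparing models this way: SLIC is fitted on Log(LN(MW)) while the other models are fitted on Log(MW), so their residuals are on different scales.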
Why was the SLIC Model chosen for Gel #2?
Gel #2, Seminal and Jelly and VE Time Courses, Lane 1

           SLIC                           Quadratic Regression
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     0.00338646  0.00001147         0.04133391  0.00170849
Band 2     0.00255562  0.00000653         0.00296709  0.00000880
Band 3     0.00108465  0.00000118         0.01503750  0.00022613
Band 4     0.00613833  0.00003768         0.08140173  0.00662624
Band 5     0.00513082  0.00002633         0.02365412  0.00055952
Band 6     0.00211144  0.00000446         0.04925389  0.00242595
Band 7     0.00387409  0.00001501         0.10904385  0.01189056
Band 8     0.00788489  0.00006217         0.07433042  0.00552501
Band 9     0.00540511  0.00002922         0.15090162  0.02277130
Sums       0.00000000  0.00019403         0.00000000  0.05174200
R-Squared  0.98849460                     0.97175674

           LOG-LOG                        Local Quadratic
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     0.02354931  0.00055457         0.13307584  0.01770918
Band 2     0.03054624  0.00093307         0.02239282  0.00050144
Band 3     0.02199111  0.00048361         0.02827159  0.00079928
Band 4     0.06126702  0.00375365         0.05660829  0.00320450
Band 5     0.02770156  0.00076738         0.10757500  0.01157238
Band 6     0.09272466  0.00859786         0.01117711  0.00012493
Band 7     0.11599313  0.01345441         0.01824890  0.00033302
Band 8     0.06878414  0.00473126         0.12395746  0.01536545
Band 9     0.21494843  0.04620283         0.23603719  0.05571356
Sums       0.00000000  0.07947863         0.10726760  0.10532374
R-Squared  0.95589400

           LOG Linear                     Local Linear
           Residuals   Residuals Squared  Residuals   Residuals Squared
Band 1     1.61764360  2.61677082         0.03770193  0.00142144
Band 2     1.55422217  2.41560655         0.00384170  0.00001476
Band 3     1.51345930  2.29055906         0.01173907  0.00013781
Band 4     1.38880849  1.92878902         0.02347037  0.00055086
Band 5     1.28979944  1.66358258         0.02104061  0.00044271
Band 6     1.17819529  1.38814415         0.01748156  0.00030560
Band 7     1.05505930  1.11315014         0.01392608  0.00019394
Band 8     0.90411208  0.81741865         0.01381797  0.00019094
Band 9     0.56946702  0.32429269         0.01381797  0.00019094
Sums       11.07076670 14.55831368        0.15683723  0.00344898
R-Squared  -                              0.96081131  0.9608
Compare Values:
• SLIC Type Models:
  Log( LN(MW) ) = A + B * LN( -LN(RM) )
• Compare Log Molecular Weight:
  X = e^( LN(X) )
• Convert Log( LN(MW) ) into Log( MW ):
  Log( MW ) = Log( e^LN(MW) )
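The two SLIC transforms and the conversion back to Log(MW) can be written out as small Python helpers. This is a sketch assuming "Log" is base 10 and "LN" is the natural log, which is consistent with the standards (Log(LN(200000)) is about 1.09); the function names are hypothetical.

```python
import math

def slic_x(rm):
    """Transformed mobility: LN(-LN(RM)), defined for 0 < RM < 1."""
    return math.log(-math.log(rm))

def slic_y(mw):
    """Transformed weight: Log10(LN(MW))."""
    return math.log10(math.log(mw))

def slic_y_to_log_mw(y):
    """Convert Log10(LN(MW)) back to Log10(MW)."""
    ln_mw = 10.0 ** y                 # undo the outer Log10 to recover LN(MW)
    return ln_mw / math.log(10.0)     # Log10(MW) = Log10(e^LN(MW)) = LN(MW) / LN(10)

y = slic_y(200000)
print(round(y, 2), round(slic_y_to_log_mw(y), 2))
```

Converting back this way puts SLIC predictions on the same Log(MW) scale as the other models, so their predicted values can be compared directly.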
Log Molecular Weight Results for SLIC
Gel #2, Seminal and Jelly and VE Time Courses, Lane 3
Intercept  1.05917902
Slope      0.03963918

Standards
        Avg RM  LN(-LN(RM))  MW      Log(LN(MW))  Predicted Log(LN(MW))  Residuals  Residuals^2
Band 1  0.16     0.61        200000  1.09         1.08                   0.0034     1.14681E-05
Band 2  0.32     0.13        116500  1.07         1.06                   0.0026     6.53117E-06
Band 3  0.37    -0.01         97000  1.06         1.06                   0.0011     1.17647E-06
Band 4  0.44    -0.20         66000  1.05         1.05                   0.0061     3.76791E-05
Band 5  0.58    -0.61         45000  1.03         1.04                   0.0051     2.63253E-05
Band 6  0.71    -1.07         31000  1.01         1.02                   0.0021     4.45819E-06
Band 7  0.82    -1.62         21500  1.00         1.00                   0.0039     1.50086E-05
Band 8  0.90    -2.25         13400  0.98         0.97                   0.0079     6.21715E-05
Band 9  0.94    -2.78          6500  0.94         0.95                   0.0054     2.92152E-05
Sums                                                                     0.00       0.000194034
R-Squared  0.99

Unknowns
Unknown RM  Predicted LOG(LN(MW))  LOG(MW of Unknowns)
0.29        1.070674384            2.92
0.41        1.075431086            2.93
0.54        1.08058418             2.95
0.62        1.083755314            2.96
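A SLIC-type fit on these standards can be sketched with NumPy as below. It is an illustration under stated assumptions, not the authors' workbook: base-10 "Log", natural-log "LN", and the rounded RM values from the table, so the fitted intercept and slope will only approximate the reported 1.05917902 and 0.03963918. The helper `predict_log_mw` is hypothetical.

```python
import numpy as np

# Gel #2 standards from the table (rounded RM values as shown on the slide).
rm = np.array([0.16, 0.32, 0.37, 0.44, 0.58, 0.71, 0.82, 0.90, 0.94])
mw = np.array([200000, 116500, 97000, 66000, 45000, 31000, 21500, 13400, 6500],
              dtype=float)

x = np.log(-np.log(rm))        # LN(-LN(RM))
y = np.log10(np.log(mw))       # Log(LN(MW))

slope, intercept = np.polyfit(x, y, deg=1)   # straight line y = intercept + slope * x
residuals = y - (intercept + slope * x)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean())**2)

def predict_log_mw(rm_unknown):
    """Predicted Log10(MW) for an unknown band's relative mobility."""
    y_hat = intercept + slope * np.log(-np.log(rm_unknown))
    return 10.0 ** y_hat / np.log(10.0)      # Log10(MW) = LN(MW) / LN(10)

print(f"intercept={intercept:.4f} slope={slope:.4f} R^2={r_squared:.4f}")
print(predict_log_mw(0.5))
```

This sketch applies the LN(-LN(RM)) transform to the unknowns as well and converts back to Log(MW), so its outputs need not match the unknown columns in the table.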
Graph result of SLIC Model
[Figure: SLIC Plot for Standards and Unknowns, Gel 2 Seminal and Jelly + VE Time Courses. X axis: Ln(-Ln(Relative Mobility)). Y axis: Log(Ln(Molecular Weight)). Series: Standards for Gel 2, Unknowns for Gel 2.]
Why was the LOG-LOG Model Chosen for 12.5% Gels?
• The LOG-LOG Model worked best for the 12.5% Gels (Gel #4 VE + Tris Supernatant Time Course and Gel #6 VE + Tris Pellet Time Course).
• Small residuals
• R^2 > 0.9
• Residuals did not have long runs of all-positive or all-negative values.
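The last criterion can be checked numerically by looking for long runs of same-sign residuals, which indicate that a model bends the wrong way over part of the gel. A small sketch; the function name and the example residuals are hypothetical, and the slides do not say how the authors assessed this (it may simply have been by eye).

```python
def longest_same_sign_run(residuals):
    """Length of the longest run of consecutive residuals sharing a sign.

    Residuals must be signed and ordered by band position; zeros break a run.
    """
    longest = current = 0
    previous_sign = 0
    for r in residuals:
        sign = (r > 0) - (r < 0)
        if sign != 0 and sign == previous_sign:
            current += 1
        else:
            current = 1 if sign != 0 else 0
        previous_sign = sign
        longest = max(longest, current)
    return longest

# Made-up signed residuals, ordered by band position on the gel.
print(longest_same_sign_run([0.02, -0.01, 0.03, -0.02, 0.01]))   # signs alternate
print(longest_same_sign_run([0.04, 0.03, 0.01, -0.02, -0.05]))   # systematic drift
```

A short longest run relative to the number of bands is what "no large sections of positive or negative" amounts to.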
The Log-Log Model
• The Log-Log model is of the form:
  Log(MW) = a + b*Log(RM) + c*Log(RM)^2
• It combines the Log model and the quadratic model to make a more successful model.
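A fit of this form can be sketched with NumPy as a quadratic in Log(RM). The standards for the 12.5% gels are not listed in the talk, so the Gel #1 standards are reused below purely to have numbers to run; base-10 logs are assumed and the function name is hypothetical.

```python
import numpy as np

def fit_log_log(rm, mw):
    """Fit Log(MW) = a + b*Log(RM) + c*Log(RM)^2 and return (a, b, c, R^2)."""
    x = np.log10(np.asarray(rm, dtype=float))
    y = np.log10(np.asarray(mw, dtype=float))
    c, b, a = np.polyfit(x, y, deg=2)          # polyfit orders highest power first
    y_hat = a + b * x + c * x**2
    r_squared = 1.0 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)
    return float(a), float(b), float(c), float(r_squared)

# Gel #1 standards (rounded RM values), used here for illustration only.
rm = [0.09, 0.19, 0.22, 0.32, 0.52, 0.67, 0.88]
mw = [200000, 116500, 97000, 66000, 45000, 31000, 21500]
a, b, c, r2 = fit_log_log(rm, mw)
print(f"a={a:.4f} b={b:.4f} c={c:.4f} R^2={r2:.4f}")
```

Setting c to zero recovers a plain straight-line fit of Log(MW) against Log(RM), so the quadratic term is what lets the model follow curvature in the standards.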
Predictions
Lane labels are not recoverable; columns are numbered in the order they appear.

Relative Mobilities
Band  Col 1     Col 2     Col 3     Col 4     Col 5     Col 6
1     0.043103  0.034483  0.042017  0.033898  0.033898  0.033898
2     0.12069   0.12931   0.12605   0.127119  0.118644  0.118644
3     0.241379  0.241379  0.193277  0.237288  0.237288  0.220339
4     0.310345  0.310345  0.243697  0.322034  0.305085  0.305085
5     0.431034  0.431034  0.428571  0.432203  0.423729  0.40678
6     0.5       0.517241  0.504202  0.5       0.508475  0.508475
7     0.551724  0.543103  0.529412  0.533898  0         0
8     0.603448  0.568966  0.554622  0.550847  0.542373  0.542373
9     0.637931  0.603448  0.596639  0.59322   0.59322   0.584746
10    0.689655  0.689655  0.672269  0.677966  0.669492  0.677966

Predicted MW (Columns 1 through 4)
Band  Col 1     Col 2     Col 3     Col 4     Sum       Average
1     680.5166  869.4138  699.8616  885.8835  3135.675  783.9189
2     219.76    203.73    209.5214  207.5894  840.6008  210.1502
3     102.6781  102.6781  131.0499  104.6232  441.0293  110.2573
4     77.92182  77.92182  101.6064  74.8224   332.2724  83.0681
5     54.32988  54.32988  54.67275  54.16859  217.5011  54.37527
6     46.16116  44.47475  45.73904  46.16116  182.5361  45.63403
7     41.43274  42.15527  43.35362  42.95384  169.8955  42.47387
8     37.55081  40.05647  41.19515  41.50513  160.3076  40.07689
9     35.32852  37.55081  38.02157  38.26216  149.1631  37.29076
10    32.43066  32.43066  33.35257  33.04501  131.2589  32.81472
Conclusion
Different models worked better on different gel types: the Quadratic Regression Model worked best for the 15% gel, the SLIC Model for the gradient gel, and the LOG-LOG Model for the 12.5% gels. This process could be much improved if there were more data on the different gel types.
Thank You
Open for Questions…