CHEM 4202 Homework 1 Due 2/18/15

NAME__________________________
Calculations of percent ionization of a particular functional group are vital to determining proper protein interactions.
We have learned that using the Henderson-Hasselbalch equation can be useful for any specific buffer. However, we can
expand this calculation to take into account ALL ionizable groups in a molecule. This exercise will help us better
understand those calculations, and with the help of a computer, make many calculations fast.
Our first order of business is to remember where the Henderson-Hasselbalch equation comes from.
Given a simple acid equilibrium: RCOOH ↔ H+ + RCOO−
The equilibrium constant (or acid dissociation constant) for the reaction is expressed by:
$$K_a = \frac{[H^+][RCOO^-]}{[RCOOH]} \qquad (1)$$
where Ka is the acid dissociation constant.
If we divide both sides of eq 1 by ([H+]Ka) we obtain the following:
$$\frac{1}{[H^+]} = \frac{1}{K_a}\left(\frac{[RCOO^-]}{[RCOOH]}\right) \qquad (2)$$
Taking the log of both sides of eq 2 yields:
$$\log\frac{1}{[H^+]} = \log\left(\frac{1}{K_a}\left(\frac{[RCOO^-]}{[RCOOH]}\right)\right) \qquad (3)$$
Since log(xy) = logx + logy and log(x/y) = logx – logy, we can rewrite eq 3:
$$\log 1 - \log[H^+] = \log 1 - \log K_a + \log\left(\frac{[RCOO^-]}{[RCOOH]}\right) \qquad (4)$$
Finally, we simplify eq 4 by removing the “log1” terms (since log1 = 0) and substituting in the working
definitions that pH = −log[H+] and pKa = −logKa:
$$pH = pK_a + \log\frac{[RCOO^-]}{[RCOOH]} \qquad (5)$$
Eq 5 is a form of the Henderson-Hasselbalch equation, which has been modified to apply to our example.
It would be really useful to us if we could look at the overall charge of this species. Yes, our acid has no charge, and the
conjugate base has a charge of -1, but how much is charged at different pH values? Since we are interested in the best
way to represent our charged compound at different pH values, we see that if we isolate the ratio [RCOO−]/[RCOOH] we
can make progress in this regard. By taking the inverse log of both sides of eq 5 we get:
$$\frac{[RCOO^-]}{[RCOOH]} = 10^{(pH - pK_a)} \qquad (6)$$
In our example spreadsheet, there is a column marked pH. Enter pH values in 0.1 increments from 0-14. (HINT: Use
functions to make the process faster.) In the cell beside the pKa=, type your pK value. Let’s assume that for a generic
acid the value is 5. In the ratio column, we can now use our eq 6 to calculate the ratio of charged conjugate base to
uncharged acid. Remember, to enter a formula we must start with “=”. Since our ratio is equal to 10^(pH−pK), we can use
cell references to make our calculation quick. (HINT: A “$” in front of the letter or number of a cell reference locks that part of the reference.
This way, when you drag your formula, you will not change certain aspects of the equation). Fill in the entire column for
the ratio at each pH value.
What do you notice about the values that you obtained? Do they make sense? What do they mean?
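As a cross-check on the spreadsheet, the ratio column can also be sketched in Python. This is only a minimal sketch, assuming the suggested pKa of 5; the names `ratio_charged`, `ph_values`, and `ratios` are illustrative and do not come from the example spreadsheet.

```python
# Sketch of the "ratio" column from eq 6 for a generic acid.
# Assumption: pKa = 5, as suggested in the text. All names are illustrative.

PKA = 5.0


def ratio_charged(ph, pka):
    """Return [RCOO-]/[RCOOH] = 10^(pH - pKa), as in eq 6."""
    return 10 ** (ph - pka)


# pH from 0 to 14 in 0.1 steps; building from integers avoids float drift.
ph_values = [i / 10 for i in range(141)]
ratios = [ratio_charged(ph, PKA) for ph in ph_values]

# Print every whole pH unit to keep the output short.
for ph, ratio in zip(ph_values[::10], ratios[::10]):
    print(f"pH {ph:4.1f}  ratio {ratio:.3e}")
```

The ratio changes by a factor of 10 for every pH unit, which is why it is hard to read directly as a "how much is charged" number.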
While the calculation above may provide some information to us, ultimately, a percent charged would be more helpful.
To do this, we must reexamine eq 6. A percent would be part out of the whole, but we are only looking at a ratio. Really
what we want is:
$$\frac{[RCOO^-]}{[RCOOH] + [RCOO^-]} \qquad (7)$$
Without going into all of the math, we can get this result by manipulating eq 6 to be:
$$\frac{[RCOO^-]}{[RCOOH] + [RCOO^-]} = \frac{10^{(pH - pK_a)}}{10^{(pH - pK_a)} + 1} \qquad (8)$$
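For anyone who wants to see the skipped algebra, it is short. Writing r for the ratio in eq 6 (r is shorthand introduced here, not a symbol used elsewhere in this handout), one way to get from eq 6 to eq 8 is:

```latex
\begin{align*}
r &= \frac{[RCOO^-]}{[RCOOH]} = 10^{(pH - pK_a)} \\
[RCOO^-] &= r\,[RCOOH] \\
\frac{[RCOO^-]}{[RCOOH] + [RCOO^-]}
  &= \frac{r\,[RCOOH]}{[RCOOH] + r\,[RCOOH]}
   = \frac{r}{1 + r}
   = \frac{10^{(pH - pK_a)}}{10^{(pH - pK_a)} + 1}
\end{align*}
```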
Use eq 8 to fill in the fraction column. What do you notice about these values? Does it make sense?
In reality, if we are looking at the fraction charged, we would want the value to be negative since the charge is negative.
Therefore, you can fill in the (-)fraction column with the negative of the fraction column.
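The fraction and (-)fraction columns can be sketched the same way. This is a minimal sketch, again assuming pKa = 5 for the generic acid; the function names are made up for illustration.

```python
# Sketch of the "fraction" and "(-)fraction" columns from eq 8.
# Assumption: pKa = 5 for the generic acid. Function names are illustrative.


def fraction_deprotonated(ph, pka):
    """Fraction of the acid present as RCOO- (eq 8)."""
    ratio = 10 ** (ph - pka)
    return ratio / (ratio + 1)


def acid_charge(ph, pka):
    """Average charge from the acid group: the fraction times -1."""
    return -fraction_deprotonated(ph, pka)


for ph in (3.0, 4.0, 5.0, 6.0, 7.0):
    frac = fraction_deprotonated(ph, 5.0)
    print(f"pH {ph:.1f}  fraction {frac:.4f}  charge {acid_charge(ph, 5.0):+.4f}")
```

Unlike the ratio, the fraction always stays between 0 and 1, and it equals one half when pH = pKa.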
You should now have all of the columns filled in for a general acid. Next, we need to look at what happens with a base.
We must reexamine our equations because our base will start charged positive before it loses its proton to become
neutral. Therefore, our starting equation will look a little different.
$$pH = pK_a + \log\frac{[RNH_2]}{[RNH_3^+]} \qquad (9)$$
The problem is that we want the charged species in the numerator since we want fraction charged. We can fix this
problem by flipping the ratio. Remember your log math: this will then switch all of the signs in the equation to look like:
$$-pH = -pK_a + \log\frac{[RNH_3^+]}{[RNH_2]} \qquad (10)$$
If we want to solve for our charged ratio as we did for our acid, we end up with an equation that looks like this:
$$\frac{[RNH_3^+]}{[RNH_2]} = 10^{(pK_a - pH)} \qquad (11)$$
After you have filled in the pH column for your general base and picked a pK value (I suggest 10 for practice), you can put
your new formula in from eq 11 for the ratio. What do you notice about the values? How do they compare to the
values you obtained for the acid ratios?
If we want to fill in our fraction charged column for the base, we must make our equation look similar to eq 8. Doing the
same substitutions we should end up with:
$$\frac{[RNH_3^+]}{[RNH_2] + [RNH_3^+]} = \frac{10^{(pK_a - pH)}}{10^{(pK_a - pH)} + 1} \qquad (12)$$
Use eq 12 to fill in your fraction charged column. What do you notice about your values? How do they compare to the
acid fraction we did earlier? Ultimately, our charged fraction column, (+1)fraction, holds the same values as our fraction
column since the charged species carries a positive charge.
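The base columns can be sketched next to the acid formula for comparison. This is a minimal sketch, assuming the practice pKa of 10 for the base; the function names are illustrative only.

```python
# Sketch of the base "ratio" and "fraction" columns (eqs 11 and 12),
# with the acid fraction from eq 8 included for comparison.
# Assumption: pKa = 10 for the generic base. Function names are illustrative.


def ratio_protonated(ph, pka):
    """[RNH3+]/[RNH2] = 10^(pKa - pH), as in eq 11."""
    return 10 ** (pka - ph)


def fraction_protonated(ph, pka):
    """Fraction present as the charged RNH3+ form (eq 12)."""
    ratio = ratio_protonated(ph, pka)
    return ratio / (ratio + 1)


def fraction_deprotonated(ph, pka):
    """Fraction present as the deprotonated form (eq 8)."""
    ratio = 10 ** (ph - pka)
    return ratio / (ratio + 1)


for ph in (8.0, 9.0, 10.0, 11.0, 12.0):
    plus = fraction_protonated(ph, 10.0)
    rest = fraction_deprotonated(ph, 10.0)
    print(f"pH {ph:4.1f}  protonated {plus:.4f}  deprotonated {rest:.4f}")
```

For the same pKa the two fractions always add up to 1, so the base curve is the acid curve flipped: the charged form dominates below the pKa instead of above it.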
Now that we understand how to determine the fraction charged, we can apply the same exact concept to any chemical
even if it has more than one ionizable group. For example, what if we wanted to know what the overall charge of an
amino acid was at any pH? Remember, amino acids are zwitterions, meaning they have both a positive and negative
charged group. However, they may not both be charged at all pH values. How would we figure that out? Well, we
could determine the fraction charged of each ionizable group and then add them together.
Let’s look at the amino acid alanine. We can look up the pK1 and pK2 values from our book. We can also fill in the pH
column. What we now want to do is determine what information is important from our general acid or general base
equations. If we want to fill in the COO- column, we could imagine that this is just like the general acid where we are
creating a charged conjugate base. Therefore, we just have to enter the formula, eq 8, that we did for the fraction of
general acid charged into that column. Remember, we want the value to be negative so that we are saying it has a
negative charge. You can do this in one cell by putting your negative sign in front of the equation. The next column is
for the NH3+ portion of alanine. This looks like our general base, where the acid is charged with a neutral conjugate
base. Therefore, we want to put eq 12 into this column. Remember, we want this to be a positive value representing
the positive charge. If we want to know the overall, or net charge, for alanine at any pH, all we need to do is sum the
charges of the pieces at each pH. In the net charge column, add the values of the two charged fractions.
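The three alanine columns can be sketched as follows. The pK values used here (2.34 for the carboxyl group and 9.69 for the amino group) are assumed typical textbook numbers, so substitute the values from your own book if they differ; the function names are illustrative.

```python
# Sketch of the COO-, NH3+ and net charge columns for alanine.
# Assumptions: pK1 = 2.34 and pK2 = 9.69 (typical textbook values; use the
# values from your own book). Function names are illustrative.

PK1_ALA = 2.34  # alpha-carboxyl group (assumed)
PK2_ALA = 9.69  # alpha-amino group (assumed)


def carboxyl_charge(ph, pka):
    """Negative of eq 8: charge contributed by the COO- group."""
    ratio = 10 ** (ph - pka)
    return -(ratio / (ratio + 1))


def amino_charge(ph, pka):
    """Eq 12: charge contributed by the NH3+ group."""
    ratio = 10 ** (pka - ph)
    return ratio / (ratio + 1)


def net_charge_alanine(ph):
    """Sum of the two charged fractions at a given pH."""
    return carboxyl_charge(ph, PK1_ALA) + amino_charge(ph, PK2_ALA)


for ph in (1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0):
    print(f"pH {ph:4.1f}  net charge {net_charge_alanine(ph):+.4f}")
```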
We now have the ability to find the charge at any pH. We will ultimately want to know when the amino acid, or later the
protein, has zero net charge. Due to the limited precision of this example spreadsheet, we will not see zero in the net
charge column. However, we can find where zero would be, between the last positive value and the first negative value.
Find the pH where the net charge is zero for alanine. This is the pI, or isoelectric point.
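The search for the sign change can be sketched in Python as well. This sketch first finds the bracket at the 0.1 pH resolution of the table, then narrows it by bisection, which is a technique added here for illustration and is not something the spreadsheet does. The alanine pK values (2.34 and 9.69) are assumed textbook numbers and all names are illustrative.

```python
# Sketch: locate the pI of alanine from the sign change in net charge.
# Assumptions: pK1 = 2.34, pK2 = 9.69 (typical textbook values).
# Bisection is used here for illustration; names are illustrative.

PK1_ALA = 2.34
PK2_ALA = 9.69


def net_charge_alanine(ph):
    acid_ratio = 10 ** (ph - PK1_ALA)   # eq 6 for the carboxyl group
    base_ratio = 10 ** (PK2_ALA - ph)   # eq 11 for the amino group
    return -acid_ratio / (acid_ratio + 1) + base_ratio / (base_ratio + 1)


def bracket_sign_change(charge_fn):
    """Return (last pH with positive charge, first pH with negative charge)."""
    ph_values = [i / 10 for i in range(141)]
    for low, high in zip(ph_values, ph_values[1:]):
        if charge_fn(low) > 0 and charge_fn(high) < 0:
            return low, high
    raise ValueError("no sign change between pH 0 and 14")


def bisect_pi(charge_fn, low, high, tol=1e-6):
    """Narrow the bracket; net charge only decreases as pH rises."""
    while high - low > tol:
        mid = (low + high) / 2
        if charge_fn(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


low, high = bracket_sign_change(net_charge_alanine)
pi_alanine = bisect_pi(net_charge_alanine, low, high)
print(f"sign change between pH {low} and {high}; pI is about {pi_alanine:.3f}")
```

With only two ionizable groups, the result should agree with the average of the two pK values.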
We can calculate the pI for any of our amino acids if we examine all of the parts. The example spreadsheet is set up for
you to try glutamic acid and lysine. Both have pKR values that will have to be included in the calculation. How will you
determine which equation to use? You can check your work by looking up the pI online to make sure that you are
correct for these amino acids.
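The same idea extends to three ionizable groups by tagging each group as acid-like (neutral to negative, eq 8) or base-like (positive to neutral, eq 12). The sketch below uses histidine so that glutamic acid and lysine are left for you to work out. The histidine pK values (1.82, 9.17, and 6.00 for the side chain) are assumed textbook numbers, and every name in the code is illustrative.

```python
# Sketch: net charge and pI for an amino acid with an ionizable R group.
# Assumptions: histidine pK1 = 1.82, pK2 = 9.17, pKR = 6.00 (typical
# textbook values). Names and the bisection search are illustrative.


def group_charge(ph, pka, kind):
    """Charge from one group: 'acid' uses -eq 8, 'base' uses eq 12."""
    if kind == "acid":
        ratio = 10 ** (ph - pka)
        return -ratio / (ratio + 1)
    if kind == "base":
        ratio = 10 ** (pka - ph)
        return ratio / (ratio + 1)
    raise ValueError(f"unknown group kind: {kind}")


HISTIDINE = [(1.82, "acid"), (9.17, "base"), (6.00, "base")]


def net_charge(ph, groups):
    return sum(group_charge(ph, pka, kind) for pka, kind in groups)


def find_pi(groups, low=0.0, high=14.0, tol=1e-6):
    """Bisection on pH; assumes the charge is positive at low, negative at high."""
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(mid, groups) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


pi_histidine = find_pi(HISTIDINE)
print(f"histidine pI is about {pi_histidine:.2f}")
```

Deciding whether each group is tagged "acid" or "base" is the same decision as choosing between eq 8 and eq 12 in the spreadsheet.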
Now that we have an understanding of how we can calculate net charge, let us examine how the calculation would look
for an entire protein. Open the pK values spreadsheet. This spreadsheet gives us a table with all pK values pre-entered
so that we do not need to enter them. In addition, the net charge column has a calculation in it.
We first need to understand the calculation. If you click on the first calculation box under net charge you can see the
formula. Other than being quite long, it should look familiar. On our practice spreadsheet, we had an independent
column for each possible fraction charged. In this spreadsheet, the calculation has been combined into one cell. What
needs to be calculated in that cell? Well, let’s think about all of the possible ways that a protein can ionize. There are
pK1, pK2, and pKR values. If we think about pK1 and pK2, those are not going to matter much, as they will not be ionizable
in a protein due to peptide bonds. We do however need to consider the first amine group and the last carboxylic group,
as they are not involved in peptide bonds. The first amine group is the first part of the calculation; the carboxylic group
is the second part of the calculation. Each part of the formula is separated by a “+”, since we need to add all of the
charges together. The remaining portions of the calculation are for each R group that is ionizable. They are in
alphabetical order by one-letter abbreviation. Our formula is set up to calculate the contribution of each ionizable
group.
But what if we have the same ionizable group more than one time in a protein, such as 3 lysines? Well, we can use the
spreadsheet to add each of the occurrences of that charge, or multiply that portion by the number of times the amino
acid appears in the sequence. We can actually have the spreadsheet count the number of times each amino acid occurs
in the sequence too. This makes our spreadsheet almost automatic. If we paste the sequence of the protein into cell
B33, the number of occurrences of each amino acid will self-populate below. These numbers are then used to multiply
by the fraction charged in the formula. Can you find where each amino acid is represented in the formula? Why are
there only certain amino acids listed below the sequence? Now you only need to change the reference for the first and
last amino acids in the sequence, since the spreadsheet does not automatically do that.
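The logic of the long net charge formula can be sketched in Python. This is a hedged sketch and not the spreadsheet's actual formula: the side-chain pK values below are assumed typical textbook numbers that may differ from the pK values spreadsheet, the peptide `GKDEKA` is a made-up example, and the terminal pK values (9.60 for the glycine amino group, 2.34 for the alanine carboxyl group) are likewise assumptions.

```python
# Sketch: net charge of a peptide from residue counts.
# Assumptions: side-chain pKa values are typical textbook numbers and may
# differ from the course spreadsheet; the sequence and terminal pKa values
# are made up for illustration.
from collections import Counter

# One-letter code -> (assumed pKR, kind), in alphabetical order.
SIDE_CHAIN_PKA = {
    "C": (8.18, "acid"),
    "D": (3.65, "acid"),
    "E": (4.25, "acid"),
    "H": (6.00, "base"),
    "K": (10.53, "base"),
    "R": (12.48, "base"),
    "Y": (10.07, "acid"),
}


def group_charge(ph, pka, kind):
    """'acid' groups follow -eq 8; 'base' groups follow eq 12."""
    if kind == "acid":
        ratio = 10 ** (ph - pka)
        return -ratio / (ratio + 1)
    ratio = 10 ** (pka - ph)
    return ratio / (ratio + 1)


def protein_net_charge(ph, sequence, n_term_pka, c_term_pka):
    counts = Counter(sequence)  # residues not in the sequence count as 0
    # First amine and last carboxyl are the only free backbone groups.
    total = group_charge(ph, n_term_pka, "base")
    total += group_charge(ph, c_term_pka, "acid")
    # Each ionizable R group contributes once per occurrence.
    for residue, (pka, kind) in SIDE_CHAIN_PKA.items():
        total += counts[residue] * group_charge(ph, pka, kind)
    return total


EXAMPLE = "GKDEKA"  # hypothetical peptide, not from the assignment
for ph in (2.0, 7.0, 12.0):
    charge = protein_net_charge(ph, EXAMPLE, n_term_pka=9.60, c_term_pka=2.34)
    print(f"pH {ph:4.1f}  net charge {charge:+.3f}")
```

As in the spreadsheet, the two terminal pK values have to be supplied by hand because they depend on which amino acids sit at the ends of the chain.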
Now that you have a feel for the full spreadsheet to calculate pI of a protein, let’s do some problems.
Use the following protein sequence to answer questions 1-3:
ALIEDKMACRGNCTPG
1. Using Excel, calculate the pI for the protein to three decimal places. (HINT: this does not mean that I just want
you to add zeros!) (2 pts)
2. We know that two sulfhydryl groups can bond together to form disulfide bonds. Let us assume that the
sequence forms disulfide bonds. How does this change your calculation (answer in words)? What is the new
value of the pI for the protein to three decimal places? (4 pts)
3. Let us now say that our protein has two chains that are linked by the disulfide bond. Chain one is from the A-R
and Chain two is from G-G. How does this change your calculation (answer in words)? What is the new value of
the pI for the protein to three decimal places? (4 pts)
4. Turn in your completed Example Titration spreadsheet. This can be done electronically. (9 pts)
5. In the real world, we may want to calculate the pI of a protein. While we do not know yet how to find protein
sequences, I have included several for you to use. Please calculate the pI of each protein. (6 pts)
alpha-amylase [Drosophila melanogaster]
MFLAKSIVCLALLAVANAQFDTNYASGRSGMVHLFEWKWDDIAAECENFLGPNGYAGVQVSPVNENAVKDSRPWWERYQPISYKLETRS
GNEEQFASMVKRCNAVGVRTYVDVVFNHMAADGGTYGTGGSTASPSSKSYPGVPYSSLDFNPTCAISNYNDANEVRNCELVGLRDLNQG
NSYVQDKVVEFLDHLIDLGVAGFRVDAAKHMWPADLAVIYGRLKNLNTDHGFASGSKAYIVQEVIDMGGEAISKSEYTGLGAITEFRHS
DSIGKVFRGKDQLQYLTNWGTAWGFAASDRSLVFVDNHDNQRGHGAGGADVLTYKVPKQYKMASAFMLAHPFGTPRVMSSFSFTDTDQG
PPTTDGHNIASPIFNSDNSCSGGWVCEHRWRQIYNMVAFRNTVGSDEIQNWWDNGSNQISFSRGSRGFVAFNNDNYDLNSSLQTGLPAG
TYCDVISGSKSGSSCTGKTVTVGSDGRASINIGSSEDDGVLAIHVNAKL
son of sevenless protein [Drosophila melanogaster]
MFSGPSGHAHTISYGGGIGLGTGGGGGSGGSGSGSQGGGGGIGIGGGGVAGLQDCDGYDFTKCENAARWRGLFTPSLKKVLEQVHPRVT
AKEDALLYVEKLCLRLLAMLCAKPLPHSVQDVEEKVNKSFPAPIDQWALNEAKEVINSKKRKSVLPTEKVHTLLQKDVLQYKIDSSVSA
FLVAVLEYISADILKMAGDYVIKIAHCEITKEDIEVVMNADRVLMDMLNQSEATSCPVPCHFPRSASATYEETVKELIHDEKQYQRDLH
MIIRVFREELVKIVSDPRELEPIFSNIMDIYEVTVTLLGSLEDVIEMSQEQSAPCVGSCFEELAEAEEFDVYKKYAYDVTSQASRDALN
NLLSKPGASSLTTAGHGFRDAVKYYLPKLLLVPICHAFVYFDYIKHLKDLSSSQDDIESFEQVQGLLHPLHCDLEKVMASLSKERQVPV
SGRVRRQLAIERTRELQMKVEHWEDKDVGQNCNEFIREDSLSKLGSGKRIWSERKVFLFDGLMVLCKANTKKQTPSAGATAYDYRLKEK
YFMRRVDINDRPDSDDLKNSFELAPRMQPPIVLTAKNAQHKHDWMADLLMVITKSMLDRHLDSILQDIERKHPLRMPSPEIYKFAVPDS
GDNIVLEERESAGVPMIKGATLCKLIERLTYHIYADPTFVRTFLTTYRYFCSPQQLLQLLVERFNIPDPSLVYQDTGTAGAGGMGGVGG
DKEHKNSHREDWKRYRKEYVQPVQFRVLNVLRHWVDHHFYDFEKDPMLLEKLLNFLEHVNGKSMRKWVDSVLKIVQRKNEQEKSNKKIV
YAYGHDPPPIEHHLSVPNDEITLLTLHPLELARQLTLLEFEMYKNVKPSELVGSPWTKKDKEVKSPNLLKIMKHTTNVTRWIEKSITEA
ENYEERLAIMQRAIEVMMVMLELNNFNGILSIVAAMGTASVYRLRWTFQGLPERYRKFLEECRELSDDHLKKYQERLRSINPPCVPFFG
RYLTNILHLEEGNPDLLANTELINFSKRRKVAEIIGEIQQYQNQPYCLNEESTIRQFFEQLDPFNGLSDKQMSDYLYNESLRIEPRGCK
TVPKFPRKWPHIPLKSPGIKPRRQNQTNSSSKLSNSTSSVAAAAAASSTATSIATASAPSLHASSIMDAPTAAAANAGSGTLAGEQSPQ
HNPHAFSVFAPVIIPERNTSSWSGTPQHTRTDQNNGEVSVPAPHLPKKPGAHVWANNNSTLASASAMDVVFSPALPEHLPPQSLPDSNP
FASDTEAPPSPLPKLVVSPRHETGNRSPFHGRMQNSPTHSTASTVTLTGMSTSGGEEFCAGGFYFNSAHQGQPGAVPISPHVNVPMATN
MEYRAVPPPLPPRRKERTESCADMAQKRQAPDAPTLPPRDGELSPPPIPPRLNHSTGISYLRQSHGKSKEFVGNSSLLLPNTSSIMIRR
NSAIEKRAAATSQPNQAAAGPISTTLVTVSQAVATDEVLPLPISPAASSSTTTSPLTPAMSPMSPNIPSHPVESTSSSYAHQLRMRQQQ
QQQTHPAIYSQHHQHHATHLPHHPHQHHSNPTQSRSSPKEFFPIATSLEGTPKLPPKPSLSANFYNNPDKGTMFLYPSTNEE
collagen type IV, isoform C [Drosophila melanogaster]
MLPFWKRLLYAAVIAGALVGADAQFWKTAGTAGSIQDSVKHYNRNEPKFPIDDSYDIVDSAGVARGDLPPKNCTAGYAGCVPKCIAEKG
NRGLPGPLGPTGLKGEMGFPGMEGPSGDKGQKGDPGPYGQRGDKGERGSPGLHGQAGVPGVQGPAGNPGAPGINGKDGCDGQDGIPGLE
GLSGMPGPRGYAGQLGSKGEKGEPAKENGDYAKGEKGEPGWRGTAGLAGPQGFPGEKGERGDSGPYGAKGPRGEHGLKGEKGASCYGPM
KPGAPGIKGEKGEPASSFPVKPTHTVMGPRGDMGQKGEPGLVGRKGEPGPEGDTGLDGQKGEKGLPGGPGDRGRQGNFGPPGSTGQKGD
RGEPGLNGLPGNPGQKGEPGRAGATGKPGLLGPPGPPGGGRGTPGPPGPKGPRGYVGAPGPQGLNGVDGLPGPQGYNGQKGGAGLPGRP
GNEGPPGKKGEKGTAGLNGPKGSIGPIGHPGPPGPEGQKGDAGLPGYGIQGSKGDAGIPGYPGLKGSKGERGFKGNAGAPGDSKLGRPG
TPGAAGAPGQKGDAGRPGTPGQKGDMGIKGDVGGKCSSCRAGPKGDKGTSGLPGIPGKDGARGPPGERGYPGERGHDGINGQTGPPGEK
GEDGRTGLPGATGEPGKPALCDLSLIEPLKGDKGYPGAPGAKGVQGFKGAEGLPGIPGPKGEFGFKGEKGLSGAPGNDGTPGRAGRDGY
PGIPGQSIKGEPGFHGRDGAKGDKGSFGRSGEKGEPGSCALDEIKMPAKGNKGEPGQTGMPGPPGEDGSPGERGYTGLKGNTGPQGPPG
VEGPRGLNGPRGEKGNQGAVGVPGNPGKDGLRGIPGRNGQPGPRGEPGISRPGPMGPPGLNGLQGEKGDRGPTGPIGFPGADGSVGYPG
DRGDAGLPGVSGRPGIVGEKGDVGPIGPAGVAGPPGVPGIDGVRGRDGAKGEPGSPGLVGMPGNKGDRGAPGNDGPKGFAGVTGAPGKR
GPAGIPGVSGAKGDKGATGLTGNDGPVGGRGPPGAPGLMGIKGDQGLAGAPGQQGLDGMPGEKGNQGFPGLDGPPGLPGDASEKGQKGE
PGPSGLRGDTGPAGTPGWPGEKGLPGLAVHGRAGPPGEKGDQGRSGIDGRDGINGEKGEQGLQGVWGQPGEKGSVGAPGIPGAPGMDGL
PGAAGAPGAVGYPGDRGDKGEPGLSGLPGLKGETGPVGLQGFTGAPGPKGERGIRGQPGLPATVPDIRGDKGSQGERGYTGEKGEQGER
GLTGPAGVAGAKGDRGLQGPPGASGLNGIPGAKGDIGPRGEIGYPGVTIKGEKGLPGRPGRNGRQGLIGAPGLIGERGLPGLAGEPGLV
GLPGPIGPAGSKGERGLAGSPGQPGQDGFPGAPGLKGDTGPQGFKGERGLNGFEGQKGDKGDRGLQGPSGLPGLVGQKGDTGYPGLNGN
DGPVGAPGERGFTGPKGRDGRDGTPGLPGQKGEPGMLPPPGPKGEPGQPGRNGPKGEPGRPGERGLIGIQGERGEKGERGLIGETGNVG
RPGPKGDRGEPGERGYEGAIGLIGQKGEPGAPAPAALDYLTGILITRHSQSETVPACSAGHTELWTGYSLLYVDGNDYAHNQDLGSPGS
CVPRFSTLPVLSCGQNNVCNYASRNDKTFWLTTNAAIPMMPVENIEIRQYISRCVVCEAPANVIAVHSQTIEVPDCPNGWEGLWIGYSF
LMHTAVGNGGGGQALQSPGSCLEDFRATPFIECNGAKGTCHFYETMTSFWMYNLESSQPFERPQQQTIKAGERQSHVSRCQVCMKNSS
projectin [Drosophila melanogaster]
VKAINAAGPGEPSDASKPIITKPRKLAPKILDPTKNIRTYNFKSGEPIFLDINISGEPAPDVTWNQNNKSVQTTSFSHIENLPYNTKYI
NNNPERKDTGLYKISAHNFYGQDQVEFQINIITKPGKPGGPLEVSEVHKDGCKLKWKKPKDDGGEPVESYLVEKFDPDTGIWLPVGRSD
GPEYNVDGLVPGHDYKFRVKAVNKEGESEPLETLGSIIAKDPFSVPTKPGVPEPTDWTRNKVELAWPEPASDGGSPIQGYIVEVKDKYS
PLWEKALETNSPTPTATVQGLIEGNEYQFRVVALNKGGLSEPSDPSKIFTAKPRYLAPKIDRRNLRNITLSSGTALKLDANITGEPAPK
VEWKLSNYHLQSGKNVTIETPDYYTKLVIRPTQRSDSGEYLVTATNTSGKDSVLVNVVITDKPSPPNGPLQISDVHKEGCHLKWKAPSD
DGGTPIEYFQIDKLEPETGCWIPSCRSTEPQVDVTGLSPGNEYKFRVSAVNAEGESQPLVGDESIVARNPFDEPGKPENLKATDWDKDH
VDLAWTPPLIDGGSPISCYIIEKQDKYGKWERALDVPADQCKATIPDLVEGQTYKFRVSAVNAAGTGEPSDSTPPIIAKARNKPPIIDR
SSLVEVRIKAGQSFTFDCKVSGEPAPQTKWLLKKKEVYSKDNVKVTNVDYNTKLKVNSATRSDSGIYTVFRENANGEDSADVKVTVIDK
PAPPNGPLKVDEINSESCTLHWNPPDDDGGQPIDNYVVGKLDETTGRWMTAGETDGPVTALKVGGLTPGHKYKFRVRAKNAQGTSEPLT
TAQAIIAKNPFDVPTKPGTPTIKDFDKEFVDLEWTRPEADGGSPITGYVVEKRDKFSPDWEKCAEISDDITNAHVPDLIEGLKYEFRVR
AVNKAGPGSPSDATETHVARPKNTPPKIDRNFMSDIKIKAGNVFEFDVPVTGEPLPSKDWTHEGNMIINTDRVKISNFDDRTKIRILDA
TSDTGVYTLTARNINGTDRHNVKVTILDAPSVPEPALRNGDVSKNSIVLRWRPPKDDGGSEITHYVVEKMDNEAMRWVPVGDCTDTEIR
ADNLIENHDYSFRVRAVNKQGQSQPLTTSQPITAKDPYSHPDKPGQPQATDWGKHFVDLEWSTPKRDGGAPISSYIIEKRPKFGQWERA
AVVLGDNCKAHVPELTNGGEYEFRVIAVNRGGPSDPSDPSSTIICKPRFLAPFFDKSLLNDITVHAGKRLGWTLPIEASPRPLITWLYN
GKEIGSNSRGESGLFQNELTFEIVSSLRSDEGRYTLILKNEHGSFDASAHATVLDRPSPPKGPLDITKITRDGCHLTWNVPDDDGGSPI
LHYIIEKMDLSRSTWSDAGMSTHIVHDVTRLVHRKEYLFRVKAVNAIGESDPLEAVNTIIAKNEFDEPDAPGKPIITDWDRDHIDLQWA
VPKSDGGAPISEYIIQKKEKGSPYWTNVRHVPSNKNTTTIPELTEGQEYEFRVIAVNQAGQSEPSEPSDMIMRKPRYLPPKIITPLNEV
RIKCGLIFHTDIHFIGEPAPEATWTLNSNPLLSNDRSTITSIGHHSVVHTVNCQRSDSGIYHLLLRNSSGIDEGSFELVVLDRPGPPEG
PMEYEEITANSVTISWKPPKDNGGSEISSYVIEKRDLTHGGGWVPAVNYVSAKYNHAVVPRLLEGTMYELRVMAENLQGRSDPLTSDQP
VVAKSQYTVPGAPGKPELTDSDKNHITIKWKQPISNGGSCRVDLQACKLGT