Amino Acids

“When you understand the amino acids, you understand everything”

Residue properties determine the alignment scores

Hydrophobicity is the most important characteristic of amino acids. It is the hydrophobic effect that drives proteins towards folding. Actually, it is all done by water. Water does not like hydrophobic surfaces. When a protein folds, exposed hydrophobic side chains get buried, which releases water from its sad duty of sitting against the hydrophobic surfaces of these side chains. Water is very happy in bulk water because there it has on average 3.6 H-bonds and about six degrees of freedom. So, whenever we discuss protein structure, folding, and stability, it is all about the entropy of water, and that is called the hydrophobic effect.

When hydrophobic objects come together in water, the number of unhappy waters goes down, and that is good for stability. Free waters are happy waters.

HYDROPHOBICITY

Aliphatic/hydrophobic   Ala, Leu, Ile, Val
Polar                   Asn, Gln
Alcoholic               Ser, Thr, (Tyr)
Sulfur-containing       Met, Cys
Aromatic                Phe, Tyr, Trp, (His)
Charged                 Arg, Lys, Asp, Glu, (His)
Special                 Gly (no R), Pro (cyclic)
Helix lovers            Ala, Met, Glu, Leu, Lys, (Gln, Phe)
Strand lovers           Val, Ile, Thr, Trp, Tyr, Phe
Turn lovers             Pro, Ser, Asp, Asn, Gly

Secondary Structure Preference

Amino acids form chains: the sequence, or primary structure. These chains fold into α-helices, β-strands, β-turns, and loops (or for short: helix, strand, turn, and loop), the secondary structure. These secondary structure elements fold further to make whole proteins, but more about that later.

There are relations between the physico-chemical characteristics of the amino acids and their secondary structure preference. For example, the β-branched residues (Ile, Thr, Val) like to sit in β-strands.
β-branched prefers β-strand

Secondary Structure Preferences

                 helix   strand   turn
Isoleucine       1.08    1.60     0.47
Leucine          1.41    1.30     0.59
Phenylalanine    1.13    1.38     0.60
Threonine        0.83    1.19     0.96
Tryptophan       1.08    1.37     0.96
Tyrosine         0.69    1.47     1.14
Valine           1.06    1.70     0.50

Subset of strand-lovers. These residues have in common either their β-branched nature (Ile, Thr, Val) or their large and hydrophobic character (the rest).

Secondary Structure Preference

Most secondary structure elements are located at the surface of the protein, so most helices have an inward-pointing side and an outward-pointing side.

[Figure: helices]

So, helices pack because of the hydrogen bonds and because of the hydrophobic packing of side chains along the length of the helix. Certain residues do this hydrophobic packing better than others, and those residues are thus good for a helix.

Secondary Structure Preferences

                 helix   strand   turn
Alanine          1.42    0.83     0.66
Glutamic Acid    1.39    1.17     0.74
Glutamine        1.11    1.10     0.98
Leucine          1.41    1.30     0.59
Lysine           1.14    0.74     1.01
Methionine       1.45    1.05     0.60
Phenylalanine    1.13    1.38     0.60

Subset of helix-lovers. If we forget alanine (I don’t understand that thing’s affair with the helix at all), they share the presence of a (hydrophobic) Cβ, Cγ, and Cδ (Sδ in Met). These hydrophobic atoms pack on top of each other in the helix. That creates a hydrophobic effect.

Secondary Structure Preferences

                 helix   strand   turn
Aspartic Acid    1.01    0.54     1.46
Asparagine       0.67    0.89     1.56
Glycine          0.57    0.75     1.56
Proline          0.57    0.55     1.52
Serine           0.77    0.75     1.43

Subset of turn-lovers. Glycine is special because it is so flexible, so it can easily make the sharp turns and bends needed in a β-turn. Proline is special because it is so rigid; you could say that it is pre-bent for the β-turn. Aspartic acid, asparagine, and serine have in common that they have short side chains that can form hydrogen bonds with their own backbone.
These hydrogen bonds compensate for the energy lost by bending the chain into a β-turn.

Secondary Structure Preferences

                 helix   strand   turn
Alanine          1.42    0.83     0.66
Arginine         0.98    0.93     0.95
Aspartic Acid    1.01    0.54     1.46
Asparagine       0.67    0.89     1.56
Cysteine         0.70    1.19     1.19
Glutamic Acid    1.39    1.17     0.74
Glutamine        1.11    1.10     0.98
Glycine          0.57    0.75     1.56
Histidine        1.00    0.87     0.95
Isoleucine       1.08    1.60     0.47
Leucine          1.41    1.30     0.59
Lysine           1.14    0.74     1.01
Methionine       1.45    1.05     0.60
Phenylalanine    1.13    1.38     0.60
Proline          0.57    0.55     1.52
Serine           0.77    0.75     1.43
Threonine        0.83    1.19     0.96
Tryptophan       1.08    1.37     0.96
Tyrosine         0.69    1.47     1.14
Valine           1.06    1.70     0.50

Chou-Fasman parameters

Say your dataset is 1000 amino acids and 350 of them are in alpha-helix conformation. This is 35%. There are 50 alanines in your set and 25 of them are in alpha-helix conformation. This is 50%. The helix preference parameter P for Ala is 50/35 = 1.43.

Take home message:

Preference parameter > 1.0: the residue has a preference for the specific secondary structure.
Preference parameter = 1.0: the residue neither prefers nor dislikes the specific secondary structure.
Preference parameter < 1.0: the residue dislikes the specific secondary structure.

Sequence Alignment

Don’t forget that we still want to gather information about an unknown protein for which we determined the sequence. To gather that information, we will need databases and sequence alignments. To do these sequence alignments, we need to know everything about the amino acids. And that is what we are working on now.
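
The Chou-Fasman arithmetic above (50% of alanines in helix, divided by 35% of all residues in helix) can be written out as a small Python sketch. The function names `preference` and `interpret`, and the `tolerance` used to decide when a value counts as "equal to 1.0", are illustrative assumptions and not part of the lecture; the counts are the ones from the worked example.

```python
def preference(n_residue_in_state, n_residue_total, n_state_total, n_total):
    """Preference parameter P for one residue type and one secondary structure.

    P = (fraction of this residue type found in the structure)
        / (fraction of all residues found in the structure)
    """
    residue_fraction = n_residue_in_state / n_residue_total
    overall_fraction = n_state_total / n_total
    return residue_fraction / overall_fraction


def interpret(p, tolerance=0.005):
    """Turn P into the take-home message.

    The tolerance is an assumption made here so that values that round
    to 1.00 count as 'no preference'; the lecture does not specify one.
    """
    if abs(p - 1.0) <= tolerance:
        return "indifferent"
    return "prefers" if p > 1.0 else "dislikes"


# Worked example from the text: 1000 residues, 350 in helix;
# 50 alanines, 25 of them in helix.
p_ala_helix = preference(25, 50, 350, 1000)
print(f"P(helix) for Ala = {p_ala_helix:.2f} ({interpret(p_ala_helix)})")
```

The same two functions can be applied to any row of the table: for instance, `interpret(0.57)` for glycine's helix value gives "dislikes", matching the take-home message.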