The structure and function of fibrous proteins
Ka-Lok Ng, Asia University

Introduction
• Historically, proteins were divided into globular and fibrous proteins, which have very different properties and roles
• Today it is preferable to avoid this distinction and to treat proteins as belonging to families that exhibit structural or sequence homology
• Fibrous proteins make up many of the fibres found in the body
• They share a common role in conferring strength and rigidity
• They are found in cells and in connective tissues such as tendons and ligaments
• These proteins tend to occur as rod-like structures
• The next chapter will deal with membrane proteins

The amino acid composition and organization of fibrous proteins
• Fibrous proteins show at least three different structural designs:
• (1) coiled-coils of α helices, as in the α keratins
• (2) extended antiparallel β sheets, as in silk fibroin, a collection of proteins made by spiders or silkworms
• (3) a triple-helical arrangement of polypeptide chains, as shown by the collagen family of proteins

• The amino acid composition of fibrous proteins differs considerably from that of globular proteins
• Collagen has a high proline content (> 20%), whilst silk fibroin has < 1%
• In α keratin the cysteine content is 11.2%, but in collagen and silk fibroin the level of cysteine is ~0%
• In each case, the amino acid composition influences the secondary structure formed by fibrous proteins

Keratins
• Found in the hair, feathers, scales, nails or hooves of animals
• Properties: mechanically strong and unreactive
• There are at least two major groups of keratins: α keratins are typically found in mammals, whilst β keratins are found in birds and reptiles
• β keratins are analogous to the silk fibroin structures produced by spiders and silkworms
• α keratins are a subset of the filamentous proteins based on coiled-coils, called intermediate filaments (IF)
• http://scop.berkeley.edu/search.cgi
(intermediate +filament, search PDB using MSDlite → view)
• IFs are major components of cytoskeletal structures

• The most common arrangement for keratin is a coiled-coil of two α helices
• Three-stranded arrangements occur in the extracellular domains of keratins
• Four-stranded coiled-coils are found in insects
• Figure 4.1: the coiled-coil arrangement, sometimes called a superhelix
• X-ray diffraction studies showed periodicities of 1.5 Å (the rise per residue of a regular α helix) and 5.1 Å (the pitch)
[Figure: helix packing at interaction angles of 50° and 20°, with the labels "repulsive" and "superhelix is left-handed"]
http://bmbiris.bmb.uga.edu/wampler/8010/lectures/motifs/sld018.htm

• The coiled-coil is formed by each helix interacting with the other, burying their hydrophobic residues away from the solvent interface
• The hydrophobic amino acids occur at regular intervals throughout the chain
• Charged side chains are preferred at positions within the helices that are in contact with solvent → a repeating unit of 7 residues, called the heptad repeat → represented by a helical wheel (Figure 4.2)
• The coiled-coil arrangement leads to a slight decrease in the number of residues per turn, to 3.5
[Figure label: hydrophobic interaction between positions 1 and 4, usually Leu, Ile and Ala]

The Heptad Repeat of The Coiled-coil Structure
• Each strand of a coiled-coil protein may be viewed as a repeated consensus substring of the form (a-b-c-d-e-f-g)n, where a, b, c, ..., g are the seven different structural positions on the coil.
• The first and fourth positions (a and d) are generally apolar or hydrophobic amino acids. Discontinuities in the heptad pattern, such as stutters (disorder), are quite frequent.
• The mapping of the amino acids in the strand to the seven positions is hence not always continuous. When the two strands coil around each other, positions a and d are internalized, stabilizing the structure, while positions b, c, e, f and g are exposed on the surface of the protein.
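The heptad bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not a coiled-coil predictor: the function names, the example peptide and the hydrophobic residue set (L, I, V, M, A) are assumptions made for this sketch, whereas the slides themselves name only Leu, Ile and Ala. The sketch labels each residue with a position a to g and scores how often the a and d positions are hydrophobic for each possible starting register.

```python
from typing import List, Tuple

HEPTAD = "abcdefg"
# Assumption: an illustrative set of apolar residues (one-letter codes).
HYDROPHOBIC = set("LIVMA")


def heptad_register(sequence: str, start: str = "a") -> List[Tuple[str, str]]:
    """Pair every residue with its heptad position, assuming an unbroken repeat.

    `start` is the heptad position assigned to the first residue.
    """
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_hydrophobic_fraction(sequence: str, start: str = "a") -> float:
    """Fraction of a and d positions occupied by a hydrophobic residue."""
    core = [res for res, pos in heptad_register(sequence, start) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(sequence: str) -> str:
    """Starting position (a to g) that places most hydrophobic residues at a and d."""
    return max(HEPTAD, key=lambda s: core_hydrophobic_fraction(sequence, s))


if __name__ == "__main__":
    # Hypothetical idealized peptide: Leu at a and d, charged residues elsewhere.
    peptide = "LKELEDK" * 3
    print(best_register(peptide))
    print(core_hydrophobic_fraction(peptide, "a"))
```

Because stutters interrupt the repeat, a real sequence would need the register re-estimated along its length; the sketch assumes a single uninterrupted register.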
• This makes a precise interdigitating surface → a slightly different pitch of 5.1 Å, compared with 5.4 Å in a regular α helix
http://cis.poly.edu/~jps/coilcoil.html

• Interleaving of side chains: heptad repeat positions 1, 8, 15, 22, ...
• Leucine zipper arrangement
• The coiled-coil arrangement not only enhances the stability of a single helix but also confers considerable mechanical strength → analogous to the intertwining strands of a rope
• High content of cysteine → disulfide bridges that cross-link neighbouring coiled-coils to build up a filament → constituting hair or nail
http://bmbiris.bmb.uga.edu/wampler/8010/lectures/motifs/sld014.htm

The 7, 11 and 18 Repeat of The Coiled-coil Structure
[Figure: other types of repeat found in coiled-coils, shown alongside the 1,4 heptad repeat]

Keratins - The classification of intermediate filaments
[Table: only the entries "astroglia" and "mesenchyme" survive]

• Higher-order structure of α keratin
• Each coiled-coil aligns head to tail → staggered rows form a protofilament → protofilaments dimerize to form a protofibril → 4 protofibrils make a microfibril → macrofibril

• The coiled-coil occurs in many other proteins as a recognizable motif
• It occurs in viral membrane-fusion proteins, including the gp41 domain found as part of HIV/SIV
• The haemagglutinin component of the influenza virus
• Transcription factors (TF): the leucine zipper protein GCN4
• Muscle proteins: tropomyosin
• Widely different folds of the coiled-coil (Figure 4.4)
[Figure labels: leucine zipper DNA binding protein GCN4; c-jun proto-oncogene dimerized with c-fos; two views of the gp41 core domain of SIV]

• Mutations in genes coding for keratins have severe consequences for individuals → they affect the integrity of skin, cell adhesion, motility and proliferation
• Keratins, the most abundant proteins in epithelial cells, are encoded by two groups of genes designated type I and type II.
• There are >20 type I and >15 type II keratin genes, occurring in clusters at separate loci
• Type II keratin proteins: soft epithelia (K1 – K8) and hard epithelia, such as hair, nail and parts of the tongue epithelium (Hb1 – Hb8) → 8 + 8 = 16 types
• Type I keratins: soft epithelia (K9 – K20) and hard epithelia (Ha1 – Ha10) → 12 + 10 = 22 types
• "Hard" and "soft" refer to the sulfur content of keratins: a high cysteine content, and hence a high sulfur content, marks the hard keratins

• The distribution of some of the keratins is given in Table 4.3
• Type II keratins are the basic or neutral counterparts to the acidic type I keratins. Each type II keratin forms a heterodimer with a specific acidic keratin, and the heterodimers are organised into tetramers and then into chains
• Type I and type II keratin genes are regulated in a pairwise, epithelial tissue type-specific and differentiation-specific manner, giving rise to "patterns" that have been and continue to be useful for studying epithelial growth and differentiation.
• K10 – K1 → suprabasal epidermal keratinocytes
• K12 – K3 → cornea of the eye
• K13 – K4 → epithelial layers such as those of the oral cavity, tongue and oesophagus
• K14 – K5 → basal layer keratinocytes

• Disulfide bridges between cysteines can be reduced with reducing agents (mercaptans)
• Hair → reduction → from a curled state to a straightened form
• Removal of the reducing agent → oxidation of the thiol groups → disulfide bridges re-form, in a new curled conformation
• Reduction of the disulfide bridges in hair → keratin fibres stretch to over twice their original length → a conformation resembling β keratin (found in feathers, and similar to the silk-like sheets of fibroin)

Fibroin
• The silk fibroin class is made up of an extended array of β strands assembled into a β sheet
• Insects and spiders produce a variety of silks to assist in the production of webs, cocoons and nests.
• Silk consists of a collection of antiparallel β strands, which leads to a microcrystalline array of fibres in a highly ordered structure.
• The silk filament is composed of a thread-like protein called fibroin, which is bundled and arranged lengthwise to form a continuous fiber. Fibroin is composed of two subunits, Fibroin Heavy Chain and Fibroin Light Chain, which are held together covalently (via disulfide bonds).

• Silk fibroin has long stretches of repeating composition. A six-residue repeat of (Gly-Ser-Gly-Ala-Gly-Ala)n is observed to occur frequently; this motif lacks large amino acids
• These three amino acids account for over 85% of the total amino acid composition of silk fibroin, with ~45% Gly, ~30% Ala and ~15% Ser
• The six-residue motif is part of a larger repeating unit that may be repeated up to 50 times

• The interaction between Gly surfaces yields an inter-sheet spacing of 3.5 Å
• The interaction of the Ser/Ala side chains gives a spacing of 5.7 Å
[Figure: end-on view of antiparallel β strands showing the interaction of Ala side chains, with the Ala/Ser interface and the Gly interface labelled]

• Silk has many remarkable properties
• Weight for weight it is stronger than metal alloys such as steel and more resilient than synthetic polymers such as DuPont's Kevlar, yet it is finer than a human hair
• Many attempts have been made to mimic the properties of silk: biomimetic chemistry
• Extremely strong: the β strands are in a fully extended conformation, so any further extension would require the breakage of strong covalent bonds
• Flexible: a result of the van der Waals (vdW) interactions that exist between the antiparallel β strands

Spider silk proteins
• http://biology.ucr.edu/people/faculty/Hayashi.html
• Each individual spider produces as many as six or seven distinct varieties of silk from a battery of specialized glands. The different silks serve different purposes, ranging from web construction and prey capture to courtship and nest-building.
• Spider dragline silk has unmatched characteristics of strength and elasticity.
It has a tensile strength of about 200,000 psi (1 psi = 6.9 kPa, so roughly 1.4 GPa), greater than that of steel and of the same order of magnitude as DuPont's Kevlar. These silk fibers also have an elasticity of up to 35%.
• http://en.wikipedia.org/wiki/Spider_silk
Structure of spider silk: inside a typical fiber, one finds crystalline regions separated by amorphous linkages. The crystals are β-sheets that have assembled together.

Collagen
• Collagen is a major component of skin, tendons, ligaments, teeth and bones
• Collagen is a major component of connective tissue, the most abundant tissue in vertebrates (it is also found in C. elegans)
• At least 30 distinct types of collagen have been identified from their respective genes, each showing differences in amino acid sequence
• Collagens have the structure of a triple helix
• In humans at least 19 different collagens are assembled → 4 major classes → e.g. Type I (two α1(I) chains and one α2 chain), Type II (three α1 chains)
• In a mature adult, collagen fibres are extremely robust and insoluble
• Younger animals contain a greater proportion of collagen with higher solubility (because the extensive cross-linking of collagens is lacking)

The structure and function of collagen
• Fundamental structural unit: tropocollagen
• 3 tropocollagen polypeptide chains (α chains, each a left-handed helix) are supercoiled about a common axis to form a right-handed superhelical structure
• ~1000 amino acids per chain, ~300 nm in length, ~1.4 nm in diameter
• Tropocollagen is ~100 times the length of myoglobin, but its diameter is only about half that of myoglobin
• Tropocollagen is unusual in its amino acid composition: high proportions of glycine and proline residues
• Collagen has a repetitive primary sequence in which every third residue is glycine
• -Gly-Xaa-Yaa-Gly-Xaa-Yaa-Gly-Xaa-Yaa-
• Xaa and Yaa are often found to be proline or lysine
• Many of the proline and lysine residues are hydroxylated by post-translational modification (PTM) to yield hydroxyproline (Hyp) or hydroxylysine (Hyl) → Gly-Pro-Hyp occurs frequently in collagen

• Each polypeptide chain intertwines with the remaining two
chains to form a triple helix → Figure 4.6
• Each chain has the sequence (Gly-X-Y)n and forms a left-handed (LH) helix; the three chains are supercoiled in a right-handed (RH) manner about a common axis. This is not an α-helix structure
• The translation distance per residue for each chain in the triple helix is 0.286 nm, whilst the number of residues per turn is 3.3 (3.6 in a regular α-helix, p. 43) ⇒ 0.286 × 3.3 ≈ 0.94 nm for the helix pitch

• Because glycine and proline both occur at high frequency in collagen sequences, the triple helix is forced to adopt a different strategy for packing polypeptide chains.
• The close packing of chains stabilizes the triple helix through van der Waals interactions plus extensive hydrogen bonding between polypeptide chains (NH group of one Gly to the C=O group of residue X on an adjacent chain); the hydrogen bonds run transverse to the axis of the helix
[Figure: chains 1, 2 and 3 are staggered so that Y – Gly – X, Gly – X – Y and X – Y – Gly occur at approximately the same level; proline and Hyp are shown in yellow]

Melting temperature of the collagen triple helix
- The temperature at which half the helical structure has been lost
- A sharp transition at a certain temperature reflects the loss of ordered structure ⇒ a progressive loss of function
- (Gly-Pro-Pro)n: Tm ~ 24 °C
- (Gly-Pro-Hyp)n: Tm ~ 60 °C

• Further strength arises from the association of tropocollagen molecules together as part of a collagen fibre
• Each tropocollagen molecule packs together with neighbouring molecules to produce the characteristic banded appearance of fibres in electron micrographs
• Intramolecular and intermolecular cross-links result from covalent bond formation

Collagen biosynthesis
• Collagen (synthesized at the ribosome) → PTM → mature collagen, which is very different from the initial translation product
• Any process interfering with the modification of collagen tends to result in severe forms of disease
• The biosynthesis of collagen is divided into discrete reactions that differ not only in the nature of the
modification but also in their cellular location

• Initial formation of preprocollagen, the translation product formed at the ribosome. The collagen precursor contains a signal sequence that directs the protein to the endoplasmic reticulum membrane
• The polypeptide is hydroxylated, resulting in the formation of hydroxyproline (Hyp) and hydroxylysine (Hyl)
• Glycosylation of the collagen precursor: sugars, chiefly glucose and galactose, are attached via the hydroxyl group of Hyl
• Three α-chains assemble to form procollagen, which is transported to the Golgi system prior to secretion from the cell
• Procollagen peptidases remove the disulfide-rich N- and C-terminal extensions, leaving the triple-helical collagen in the extracellular matrix (Figure 4.11)
A better graphical description: http://staff.um.edu.mt/acus1/Extracellular.htm

• The triple-helical collagen in the extracellular matrix can then associate with other collagen molecules to form staggered, parallel arrays (Figure 4.12)

Disease states associated with collagen defects
• Collagen is involved in tendons, ligaments, skin and blood vessels
• Mutations in collagen genes often result in severe disorders affecting many organ systems
• Defects in the enzymes responsible for the assembly and maturation of collagen create a further group of disease states.
• Defects in collagen genes lead to osteogenesis imperfecta (osteogenesis: the formation or growth of bone), hereditary osteoporosis and familial aortic aneurysm
• The most common mutation is the substitution of Gly by a different amino acid → this destroys the characteristic repeating sequence Gly – X – Y → the collagen folds incorrectly → serious diseases such as osteogenesis imperfecta and Ehlers–Danlos syndrome (EDS), a congenital connective tissue disorder

• Osteogenesis imperfecta is a genetic disorder characterized by bones that break comparatively easily, often without obvious cause; it is also called brittle bone disease. At least four different types of osteogenesis imperfecta are recognized (~1 in 20 000; Table 4.5), ranging in severity from lethal to mild
• Osteogenesis imperfecta is caused by a mutation in one allele of the gene for either the α1 or the α2 chain of the major collagen in bone, type I collagen.
• Type I collagen contains two α1 chains and one α2 chain

Ehlers–Danlos syndrome
• Caused by mutations in a single collagen gene
• Variable phenotype: some forms are relatively benign whilst others are life-threatening
• Joint hypermobility, skin extensibility, vascular fragility
• Mutations can lead to changes in the levels of collagen molecules, changes in the cross-linking of fibres, a decreased hydroxylysine content and a failure to process collagen correctly by removal of the N-terminal regions
• The common effect of all mutations is to create a structural weakness in connective tissue as a result of a molecular defect in collagen
http://staff.um.edu.mt/acus1/Extracellular.htm

Related disorders characterized at a molecular level
Marfan syndrome
• http://www.tfrd.org.tw/rare/typeCont.php?sno=1101&kind_id=11
• Marfan syndrome is an inherited disorder of connective tissue affecting multiple organ systems including the skeleton, lungs, eyes, heart and blood vessels.
The defect is now known to reside in a related protein that forms part of the microfibrils making up the extracellular matrix that includes collagen.
• Marfan syndrome is caused by a molecular defect in the gene coding for fibrillin, an extracellular protein found in connective tissue, where it is an integral component of extended fibrils.

• Humans have two highly homologous fibrillins, fibrillin-1 (Figure 4.13) and fibrillin-2, mapping to chromosomes 15 and 5 respectively.
• A characteristic feature of both fibrillins is their mosaic composition, in which numerous small modules combine to produce the complete, very large protein of 350 kDa.
• The majority of fibrillin consists of epidermal growth factor-like (EGF-like) modules: there are 47, of which 43 have a consensus sequence for calcium binding.
• Each of these domains is characterized by 6 cysteine residues forming three S-S bridges, and by a calcium-binding sequence of D/N – x – D/N – E/Q – xm – D/N* – xn – Y/F (where m and n are variable and * indicates possible PTM by hydroxylation).
• Other modules: TB domain, hybrid domain

• The EGF domain occurs in many other proteins, including blood coagulation proteins such as factors VII, IX and X, and the low density lipoprotein receptor.
• The structures of an EGF-like domain and of a pair of calcium-binding domains confirmed a rigid rod-like arrangement stabilized by calcium binding and hydrophobic interactions (Figure 4.14).
• Mutations known to result in Marfan syndrome lead to decreased calcium binding to fibrillin → calcium binding plays an important physiological role
• A single amino acid substitution → disrupts the structural organization of individual EGF-like motifs
• Known or suspected cases: Flo Hyman, a famous American Olympic volleyball player; Abraham Lincoln, president of the USA, and the virtuoso violinist Niccolo Paganini (both speculated on the basis of their physical appearance)

Summary
1. Fibrous proteins represent a contrast to the normal topology of globular domains.
2. Fibrous proteins lack true tertiary structure, showing elongated structures and interactions confined to those between local residues.
3. The amino acid composition of fibrous proteins differs considerably from that of globular proteins but also varies widely within this group.
4. Variation in composition reflects the different roles performed by each group of fibrous proteins.
5. Three prominent groups of fibrous proteins are collagens, silk fibroin and keratins; all occupy pivotal roles within cells.
6. Collagen is very abundant in vertebrates and invertebrates.
7. A triple helix provides a platform for a wide range of structural roles in the extracellular matrix, delivering strength and rigidity to a wide range of tissues.
8. The triple helix is a repetitive structure containing the motif (Gly-Xaa-Yaa) in high frequency, with Xaa and Yaa often found as proline and lysine residues.
9. Repeating sequences of amino acids are a feature of many fibrous proteins and help to establish the topology of each protein.
10. In collagen the presence of glycine at every third residue is critical because its small side chain allows it to fit precisely into a region that forms from the close contact of three polypeptide chains.
11. Although based around a helical design, the collagen helix differs considerably in dimensions from the typical α helix.
12. The triple helix of collagen undergoes PTM to increase strength and rigidity.
13. Keratins make up hair and nails and contain polypeptide chains arranged in helical conformations.
14. The helices interact by supercoiling to form "coiled-coils".
15. Specific non-polar interactions between residues in different helices confer stability, with the basis of this interaction being a heptad of repeating residues along the primary sequence.
16. A heptad repeat possesses leucine or other residues with hydrophobic side chains arranged periodically to favour inter-helix interactions.
17. In view of their widespread distribution in all animal cells, mutations in fibrous proteins such as keratins or collagens lead to serious medical conditions. Many disease states arise from inherited disorders that lead to impaired structural integrity in these groups of proteins.
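
The role of glycine at every third residue, and the effect of replacing it, can be illustrated with a short Python sketch. The helper names, the one-letter peptides and the choice of serine as the substituting residue are hypothetical and chosen only for illustration. The check reports where a Gly-X-Y repeat expects glycine but finds another residue; it says nothing about folding or disease severity.

```python
from typing import List


def gly_xy_violations(sequence: str, frame: int = 0) -> List[int]:
    """Return 0-based indices where a Gly-X-Y repeat expects Gly but finds another residue.

    `frame` is the index of the first expected glycine (0, 1 or 2).
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    return [i for i in range(frame, len(sequence), 3) if sequence[i] != "G"]


def is_gly_xy_repeat(sequence: str, frame: int = 0) -> bool:
    """True when every third residue, starting at `frame`, is glycine."""
    return not gly_xy_violations(sequence, frame)


if __name__ == "__main__":
    # Hypothetical (Gly-Pro-Pro)4 peptide in one-letter code.
    wild_type = "GPP" * 4
    # Hypothetical mutant: the glycine at index 6 is replaced by serine.
    mutant = wild_type[:6] + "S" + wild_type[7:]
    print(gly_xy_violations(wild_type))
    print(gly_xy_violations(mutant))
```

A single reported index in the mutant marks the one position where the repeat is broken, which is the kind of Gly substitution the slides associate with incorrect folding of collagen.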