Abstract
Dipeptidyl peptidase IV is a serine oligopeptidase utilised by the periodontopathic, asaccharolytic bacterium Porphyromonas gingivalis in amino acid degradation and peptide scavenging, and has been implicated in the pathogenesis of adult periodontitis. The aims of this project were to characterise the three-dimensional structure of dipeptidyl peptidase IV and its relation to the enzyme's function, to aid in the design of future inhibitors for the treatment of periodontitis. Preliminary X-ray crystallography of native dipeptidyl peptidase IV crystals provided a 2.5 Å resolution diffraction dataset, and X-ray diffraction on selenomethionine-derived crystals provided an additional 3-wavelength multi-wavelength anomalous dispersion dataset for resolution of crystallographic phases. Prior to the start of the project an atomic model was automatically built using the Solve/Resolve software. The WinCoot and CCP4 program packages were then used for manual and automated refinement of the constructed model. Following refinement, the best model had an R-factor of 21.3%, with a corresponding Rfree value of 26.9%. The average B factor over all protein atoms for this model was 49.9 Å², and the RMS deviations of bond distances and angles were 0.0156 Å and 1.9147° respectively. Structural alignment with available eukaryotic and prokaryotic homologs revealed strong structural congruity with dipeptidyl peptidase IV; in particular, dipeptidyl peptidase IV from the gram-negative bacterium Stenotrophomonas maltophilia showed the greatest degree of structural alignment. The active site of dipeptidyl peptidase IV, however, showed greater alignment with its eukaryotic homologs, dipeptidyl peptidase IV from both Homo sapiens and Sus scrofa, suggesting that the structure-function relationship of this prokaryotic enzyme is closer to that of its eukaryotic homologs than to that of its prokaryotic homologs.
Alexander Fullwood - 1024400
Crystal structure determination of dipeptidyl peptidase IV from Porphyromonas gingivalis
Supervisor: Prof. Vilmos Fülöp
2013

Table of Contents
1.1 - Periodontal disease & Porphyromonas gingivalis
1.2 - Dipeptidyl peptidase IV: the prolyl oligopeptidase family
1.3 - Treatment of Periodontal disease: inhibition of dipeptidyl peptidase IV
1.4 - Homologs of P. gingivalis dipeptidyl peptidase IV
1.4.1 - Homo sapiens dipeptidyl peptidase IV
1.4.2 - Sus scrofa dipeptidyl peptidase IV
1.4.3 - Stenotrophomonas maltophilia dipeptidyl peptidase IV
1.4.4 - P. gingivalis prolyl tripeptidyl peptidase
1.5 - Structural Biology: X-ray Diffraction/MAD
1.6 - Aims
2.1 - Expression, purification & crystallisation
2.2 - X-Ray Diffraction
2.3 - Data Processing
2.3.1 - Solve/Resolve
2.3.2 - WinCoot
2.3.3 - CCP4
7.1 - Protein Sequence materials
7.1.1 - PgDPPIV sequence
7.1.2 - HsDPPIV sequence
7.1.3 - SsDPPIV sequence
7.1.4 - SmDPPIV sequence
7.1.5 - PgPTP sequence
7.1.6 - CLUSTAL W2 multiple sequence alignment
7.1.7 - ClustalW2 Sequence alignment Scores

1 - Introduction

1.1 - Periodontal disease & Porphyromonas gingivalis
Adult periodontitis is an inflammatory disease of the periodontium, which maintains and supports the teeth in the oral cavity. Continual destruction of the periodontium results in the eventual loss of the connective tissue between the teeth and the gum. Adult periodontitis occurs as a result of the presence of oral periodontopathogenic anaerobes, the most common of which are Porphyromonas gingivalis and Tannerella forsythia, which can be observed in afflicted patients at frequencies of 85.7% and 60.7% respectively(1).

Figure 1: 0.1% uranyl acetate stained electron micrograph of P. gingivalis strain ATCC 33277.

These 2 pathogens, along with Treponema denticola, form the so-called red complex of periodontopathogens. P. gingivalis is a gram-negative, black-pigmented, asaccharolytic bacterium (figure 1(2)) which metabolises oligopeptides as an alternative carbon/energy source to glucose and other carbohydrates(3).
It invades and replaces the facultative gram-positive bacteria in the host periodontium by formation of a complex subgingival biofilm at the tooth's surface, referred to as a plaque, where its presence often results in the destruction of the gingival connective tissues, including the alveolar bone(4) and the periodontal ligament(5, 6). In addition to periodontal disease, P. gingivalis has also been implicated in rheumatoid arthritis(7, 8), closely associated peri-implantitis(9, 10), premature labour(11) and atherosclerosis(12). High proteolytic activity and the release of inflammatory mediators by host leukocytes (proinflammatory cytokines, matrix metalloproteinases and prostanoids) are both key factors in the development of periodontal diseases. The high proteolytic activity in the oral cavity occurs as a result of a large battery of proteases & peptidases utilised by P. gingivalis, including enzymes with trypsin-like, collagenolytic & glycylprolyl peptidase activity. A well-characterised example is the gingipain family: Lys/Arg-specific Cysteine proteinases that account for 85% of the total extracellular activity(13) and contribute to much of the pathogenicity of P. gingivalis, including the evasion of the host immune system, establishment of chronic inflammation, cellular adhesion and vascular permeability & bleeding(3).

1.2 - Dipeptidyl peptidase IV: the prolyl oligopeptidase family
Dipeptidyl peptidase IV from P. gingivalis (PgDPPIV) is another enzyme that has been implicated in the virulence of P. gingivalis, in this case by mouse knock-out experiments(14). PgDPPIV was first identified as an enzyme having a role only in peptide scavenging and amino acid degradation(15); however, current research also implicates PgDPPIV in a number of proteolytic activities on peptides(16), and identifies it as a major participant in the asaccharolytic growth of P. gingivalis(17).
PgDPPIV is a 723 amino acid residue integral membrane protein located in the outer periplasmic membrane, where it is exposed to the extracellular environment. PgDPPIV belongs to the S9B subfamily of prolyl oligopeptidases (POPs), which hydrolyse peptide bonds on the C-terminal side of a Proline residue at the P1 site of linear oligopeptides (up to ~30 residues). The POP family constitutes a great number of enzymes, such as prolyl oligopeptidase (S9A), oligopeptidase B (S9A), dipeptidyl aminopeptidase X (S9B), acylaminoacyl peptidase (S9C) and prolyl tripeptidyl peptidase (S9B). POP family enzymes are unrelated to the classical trypsin and subtilisin families of peptidases; however, the mechanisms of binding & catalysis retain a degree of similarity. Catalysis is driven by a catalytic triad consisting of the residues Serine (nucleophile), Aspartate and Histidine (acid/base), which are housed in a highly conserved α/β hydrolase domain consisting of 8 β-sheets connected by α-helices. All POP family proteins share a catalytic serine consensus sequence, GxSxGGφφ (figure 2(18)), where x is any residue and φ is a hydrophobic residue. The DPPIV subfamily motif is generally GWSYGGφφ, although some DPPIV enzymes are exceptions. The mechanism of catalysis for these serine peptidases involves the formation of a negatively charged covalent acyl tetrahedral intermediate, which is stabilised by 2 hydrogen bonds with residues in a nearby oxyanion binding pocket. In chymotrypsin family enzymes these hydrogen donors are the backbone NH groups of the catalytic Serine and a nearby Glycine, and in subtilisin family enzymes they are the backbone NH of the catalytic Serine and the side-chain NH2 amide of an Asparagine. In POP family enzymes these hydrogen donors are the backbone NH of the 2nd x residue in the POP Serine consensus sequence, and the OH hydroxyl group of an upstream Tyrosine residue.
The Tyrosine residue present at the 2nd x position provides the hydrogen donor for oxyanion stability in DPPIV family enzymes.

Figure 2: Partial protein sequence alignment of the C-terminal end of prolyl oligopeptidase (POP, subfamily S9A) and other S9 family enzymes: rat liver acylaminoacyl-peptidase (ACPH, S9C subfamily), human protein 3p2l (later acylaminoacyl-peptidase, S9C subfamily), rat liver dipeptidyl peptidase IV (DPPIV, S9B subfamily) and yeast dipeptidyl peptidase B (DAPB, S9B subfamily). Highlighted in the red box is the serine consensus sequence GXSXGG present in all 5 sequences.

In addition to the α/β hydrolase domain, the POP family enzymes also contain a second domain, a β-propeller, which acts as a gating mechanism that selects substrates by size and stereochemistry to provide substrate specificity. The β-propeller of the POP family is a primarily β-sheet structure, which consists of 7 blades arranged in an ellipsoid, radial, open-velcro topology to form a solvent-exposed opening, with each blade consisting of a stack of a varying number of anti-parallel β-sheets, typically 4. DPPIV enzymes differ slightly in their β-propeller topology, containing an additional 8th blade, with the 4th blade in the domain being comprised of 7 β-sheets: 5 β-sheets in the primary blade and 2 β-sheets in a subdomain, the role of which is discussed in section 1.4. PgDPPIV is specific for both X-Pro and X-Ala dipeptides, which are cleaved from the N-terminus of oligopeptide substrates. This has been evidenced by the cleavage of glycylprolyl dipeptides from type-1 collagen partially treated with collagenase from Clostridium histolyticum(15). Type-1 collagen is a helical, polymeric protein (consisting of repeating tripeptide monomers of Gly-Pro-X) that intertwines with other collagen molecules to form a superhelical homotrimeric quaternary structure.
Cleavage at X-Pro and Pro-X sites is not seen with other proteases & peptidases, thus providing a niche of activity for the DPPIV family enzymes. DPPIVs have also been shown to act upon hydroxylated Prolines, a post-translational modification that increases the stereoelectronic stability of the collagen triple helix in eukaryotic organisms (X-HyPro-Gly)(19). More recent evidence, however, suggests that while PgDPPIV can hydrolyse type-1 collagen, it does not directly hydrolyse homotrimeric type-1 collagen in vivo, but instead promotes the activity of host-derived matrix metalloproteinase 2 (MMP-2) (gelatinase) and MMP-1 (collagenase) to aid in type-1 collagen hydrolysis, as well as having a role in mediating the adhesion of P. gingivalis fimbriae to fibronectin(20).

1.3 - Treatment of Periodontal disease: inhibition of dipeptidyl peptidase IV
The observed role of DPPIV in the pathogenicity of P. gingivalis provides a potential drug target for the treatment of adult periodontitis and other implicated diseases. However, there is very little published data on the inhibition of PgDPPIV. Current research into the treatment of periodontal disease has focused on other areas of virulence, such as the reliance on communication between periodontopathogens, particularly the symbiotic relationships between P. gingivalis and the red complex species, and with Aggregatibacter actinomycetemcomitans in particular(21). There is also much research describing the importance of fimbriae in plaque formation(22). Targeting P. gingivalis with currently available antibiotics has thus far proven difficult, as both encapsulated and non-encapsulated P. gingivalis are capable of invading host gingival fibroblasts, making internalised P. gingivalis resistant to current antibiotics(23). Currently, vaccination in combination with passive immunisation & probiotic therapy is the favoured direction for the treatment of periodontal disease(24).
Mus musculus monoclonal antibodies in particular have been put forward for potential use in passive immunisation. One such antibody, MAb-Pg-DAP-1, was produced by Teshirogi et al in 2003(25) using highly purified PgDPPIV as an immunogen. The developed MAb was also capable of inhibiting DPPIV in gram-negative species such as Porphyromonas endodontalis and Prevotella loescheii, but was not able to inhibit the DPPIV present in the gram-positive species Streptococcus mutans and Actinomyces viscosus. HsDPPIV present in blood serum was also not inhibited.

Only 2 papers to date have been published on direct PgDPPIV inhibition. Gilmore et al published a paper in 2006 describing the kinetic effects of the binding of a biotinylated dipeptide proline diphenyl phosphonate inhibitor, H2N-Glu(biotinyl-PEG)-ProP(OPh)2 (figure 3), on DPPIV-like serine peptidases(26). For the porcine homolog (Sus scrofa DPPIV, SsDPPIV) it was shown to be an irreversible inhibitor with a second-order rate constant (ki/Ki) for inhibition of 1.57 × 10³ M⁻¹ min⁻¹, which is favourable in comparison to other P1 proline diphenyl phosphonate dipeptide inhibitors. Studies carried out on P. gingivalis strain W83 did not elucidate the kinetics of inhibition of PgDPPIV; however, the inhibitor did prove to be an efficient probe for the presence of DPPIVs in crude sonicates of W83 using western blot analysis of the 80 kDa protein. This was, to their knowledge, the first DPPIV active site to be labelled in this manner, and may have applications beyond the treatment of periodontal disease, in other serine protease-related diseases such as diabetes, mitochondrial disease and rheumatoid arthritis.

Figure 3: Chemical structure of the biotinylated dipeptide proline diphenyl phosphonate inhibitor, H2N-Glu(biotinyl-PEG)-ProP(OPh)2.

The other paper, also published in 2006, by Bodet et al describes the effects of a non-dialysable material (NDM) extract from cranberry juice on the proteolytic activities of the red complex bacteria P. gingivalis, T. forsythia and T. denticola(27). They characterised the hydrolytic activity of PgDPPIV on synthetic chromogenic peptides, as well as the total P. gingivalis proteolytic activity on type-1 collagen and transferrin. Their results showed that this NDM reduced the activity of PgDPPIV by 50% at a concentration of 150 µg mL⁻¹ (figure 4); however, compared to other enzymes such as Arg-gingipain and Lys-gingipain the inhibition of DPPIV is much weaker, with Arg-gingipain and Lys-gingipain being inhibited at concentrations of 75 µg mL⁻¹ and 25 µg mL⁻¹ respectively. Combined with 30% inhibition of total collagenase activity at 50 µg mL⁻¹ and 95% of total transferrin degradation activity at 150 µg mL⁻¹, there is still an implication that this NDM may be effective in the treatment of adult periodontitis.

Figure 4: Effect of the cranberry NDM fraction on DPPIV activity of P. gingivalis. The degradation obtained in the absence of NDM was given a value of 100%. *P < 0.05 between NDM of various concentrations and control without NDM.

It is evident that thus far screening techniques have not identified any potential inhibitors of PgDPPIV. Direct structure-based drug design, however, may identify potential inhibitors. This technique is utilised to develop novel inhibitors tailored specifically for their target, ensuring selectivity for the target, and has proven to be a much more efficient and cost-effective development process than high-throughput screening. However, to design inhibitors in such a way, one first needs to have an understanding of the structural features that can be attributed to the function of the target protein: the so-called structure-function relationship. For enzymes such as PgDPPIV, this would require an understanding of the structural events occurring at the active site, particularly of the residues making up the binding pockets of the natural substrates.
To obtain this knowledge, biophysical techniques are commonly used to determine the 3-dimensional structure of proteins. Of the structures currently available in the Protein Data Bank (PDB), 88.2% were solved using X-ray crystallography, while 10.9% were solved using nuclear magnetic resonance (NMR). If there is sufficient similarity between structures, homologs of the protein in question can be used to aid in the design of such structure-specific inhibitors.

The inhibition of Homo sapiens DPPIV (HsDPPIV) is well characterised in the treatment of human diseases; DPPIV inhibitors, also referred to as gliptins (figure 5), are currently used in the treatment of type II diabetes mellitus. Inhibitors of DPPIVs have been sought since the late 1980s and early 1990s; however, the first clinically approved gliptin, sitagliptin (previously MK-0431, figure 5A(28)), was only FDA approved in 2006. Mechanistically it is a competitive inhibitor of DPPIV, preventing the degradation of the gastrointestinal incretins glucagon-like peptide-1 and gastric inhibitory polypeptide, which are released in response to meal ingestion in the gastrointestinal tract. This inhibition results in an increase in extracellular insulin and a decrease in the production of glucagon. Currently there are 7 gliptins available on the market, with the most recent, alogliptin (figure 5G(29)), being FDA approved in 2013. The structure-activity relationship of HsDPPIV and inhibitors such as vildagliptin (figure 5B) is discussed in section 1.4.1. Since their inception, direct drug design analysis has been utilised to optimise inhibitor selection for HsDPPIV(30).

Figure 5: Chemical structures of 7 gliptins either approved or currently in development/clinical trials. These are (in chronological order of published research): A) sitagliptin, B) vildagliptin, C) saxagliptin, D) linagliptin, E) dutogliptin, F) gemigliptin and G) alogliptin.

1.4 - Homologs of P. gingivalis dipeptidyl peptidase IV
To date, no structure of PgDPPIV has appeared in the literature. However, a combination of sequence alignment (based on Clustal W2 alignment, see 7.1, appendix) and reading of the currently available literature has identified potential homolog candidates of PgDPPIV: HsDPPIV, SsDPPIV, Stenotrophomonas maltophilia DPPIV (SmDPPIV) and P. gingivalis prolyl tripeptidyl peptidase (PgPTP). Prior attempts at crystallographic study of PgDPPIV have had no success using molecular replacement techniques to resolve the structure of PgDPPIV using such homologs(31); however, structural comparisons in combination with the interpretation of conserved residues may elucidate whether the structure-activity relationship of PgDPPIV is conserved, and perhaps aid us in the pursuit of new drug inhibitors for PgDPPIV.

1.4.1 - Homo sapiens dipeptidyl peptidase IV
HsDPPIV (shown in figure 6) shares 25.45% of its sequence with PgDPPIV and is the most extensively studied homolog of PgDPPIV, particularly in the study of DPPIV family enzyme inhibitors. HsDPPIV has a role in amino acid degradation and peptide scavenging(32); however, it is also implicated in glucose homeostasis(33), T-cell signalling(34), chemotaxis(35) and cancerogenesis(36). Confirmed protein interactions include interactions with adenosine deaminase(37), fibronectin(38), collagen(39), HIV gp120 protein(40), chemokine receptor CXCR4(41) and tyrosine phosphatase CD45(42). Diseases which have been associated with the action of HsDPPIV include type 2 diabetes mellitus(43), obesity(44), tumour growth(45) and HIV infection(46).

Figure 6: Cartoon representation of the dimeric configuration of HsDPPIV from an A) front perspective and B) top-down perspective. Rendering was accomplished with PyMOL, using crystal structure 1nu8 from the PDB.
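The pairwise sequence figures quoted in this section (for example the 25.45% shared between HsDPPIV and PgDPPIV) come from Clustal W2 alignments. As a rough illustration of what such a figure measures, the following is a minimal sketch that computes percent identity from two already-aligned sequences. It assumes that identity is counted only over columns where neither sequence has a gap; the exact convention used by Clustal W2 may differ, and the function name and toy fragments are hypothetical and are not taken from the appendix sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumption: gap-containing columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # columns where those residues match
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy aligned fragments, invented purely for illustration.
toy_a = "GWSYGG-FVT"
toy_b = "GWSNGGYM-T"
print(round(percent_identity(toy_a, toy_b), 2))  # 6 identical of 8 compared columns -> 75.0
```

Applied to the full pairwise alignments in the appendix, a function of this kind should give values close to the quoted percentages, provided the same gap convention is used.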
The structure of HsDPPIV used in this report was that of HsDPPIV crystallised with the ligand diprotin A (Ile-Pro-Ile), solved at 2.5 Å resolution by Thoma et al(47), with more current information sourced from the 2006 review of POP family peptidases by Rea D & Fülöp V(48). HsDPPIV is a single polypeptide composed of 766 amino acid residues, with a polypeptide pair associating as a homodimer (figure 6), the formation of which has been linked with the regulation of catalytic activity(49); however, dimerisation is not required to achieve catalytic activity(50). The 2 conserved domains, the α/β hydrolase domain at the C-terminus and the β-propeller domain at the N-terminus, are shown in figures 7A and 7B respectively. Each monomer contains 9 N-glycosylated Asparagine residues and 5 disulphide bridges: 4 stabilising the β-propeller domain and 1 housed in the α/β hydrolase domain.

Figure 7: Cartoon representations of A) the α/β hydrolase domain of HsDPPIV, showing the 8 β-sheet topology of the domain and B) the β-propeller domain of HsDPPIV, showing the 8 bladed topology and pore of the domain. Rendering was accomplished with PyMOL, using crystal structure 1nu8 from the PDB.

The active site sits in the interface between the α/β hydrolase domain (Gln508-Pro766) & the β-propeller domain (Arg54-Asn497), and has 2 solvent-exposed entrances: a side entrance (figure 8A) and an entrance through the β-propeller (figure 8B); however, the mode of entry still remains controversial. The solvent-exposed opening to the active site through the β-propeller has a length of 14 Å and a width of 7 Å. These tunnel dimensions imply that the tunnel allows for passage of an extended peptide or a hairpin loop, but not for a folded α-helix. The blades of the β-propeller are divided into 2 subdomains: blades II-V and blades I & VI-VIII. Bending of blade I and the subtle bending of blades II–IV results in the formation of the slightly larger side entrance.
Both entrances are large enough to negate the need for conformational changes. The membrane binding domain, found only in the integral membrane form of DPPIV, is located at the N-terminus. It comprises a short cytoplasmic tail (Met1-Arg6) and a single transmembrane helix (Val7-Leu28). This domain does not simply anchor the protein, but also contributes to the formation of the quaternary structure(51).

Figure 9 shows the motifs involved in dimerisation of HsDPPIV. The subdomain which extends from blade IV of the β-propeller domain forms an anti-parallel β-sheet structure that is the primary dimerisation motif for HsDPPIV, and is involved in the formation of a salt bridge with the blade IV subdomain of the partner monomer. Residues Phe713–Thr736 from the α/β hydrolase domain comprise a loop that also plays a role in the dimerisation of HsDPPIV. In the monomeric form these residues are exposed, resulting in distortion of the catalytic triad(52). Rasmussen et al hypothesise that this loop forms a lid-type structure that covers the side opening in the monomeric form(53). Residues Arg658–Tyr661 form a short α-helix which also contributes to the dimer interface.

Figure 8: Cartoon representation of HsDPPIV, showing A) the solvent-exposed side entrance and B) the solvent-exposed β-propeller entrance to the active site. The α/β hydrolase domain is coloured in orange & the β-propeller domain is coloured in blue. The substrate shown is diprotin A (Ile-Pro-Ile) in a tetrahedral intermediate complex with Ser630. Rendering was accomplished with PyMOL, using crystal structure 1nu8 from the PDB.

Figure 9: Cartoon representation of the HsDPPIV homodimer with highlighted residues involved in dimerisation: residues Arg658–Tyr661 (red) and Phe713–Thr736 (blue) from the α/β hydrolase domain and the anti-parallel β-sheet stack subdomain (yellow) protruding from the 2nd β-sheet of blade 4 in the β-propeller domain.
Figure 9A is from a front perspective, and figure 9B is a top-down perspective. Rendering was accomplished with PyMOL, using crystal structure 1nu8 from the PDB.

Figure 10: Stick/cartoon diagram of the HsDPPIV active site. Shown are diprotin A (Ile-Pro-Ile) in a tetrahedral intermediate complex with Ser630 (white), S1 hydrophobic residues Val711, Val656, Tyr662, Tyr666 & Trp659 (magenta), oxyanion stabilising residues Tyr547 & Tyr631 (green), catalytic triad Ser630, Asp708 & His740 (orange) and Glu205-Glu206 contributed from the β-propeller domain (cyan). Rendering was accomplished with PyMOL, using crystal structure 1nu8 from the PDB.

Figure 10 shows the active site of HsDPPIV with the ligand Ile-Pro-Ile covalently bound. The residues that form the active site are contributed mostly by the α/β hydrolase domain. The catalytic triad is comprised of residues Ser630, Asp708 & His740. In the state shown in figure 10, His740 is protonated, with the cationic charge being stabilised by hydrogen bonding with the side chain of Asp708. The residues responsible for the stabilisation of the negatively charged oxyanion group of the acyl tetrahedral intermediate formed during catalysis are Tyr547 and Tyr631. The S1 binding site consists of the hydrophobic residues Val711, Val656, Tyr662, Tyr666, Trp659 and Tyr631; Tyr662 & Tyr666 stack adjacently on the P1 residue, with the P2 and P1' side chains facing into the cavity of the active site. Residues Glu205 and Glu206, provided by blade IV of the β-propeller domain, interact with the N-terminus of peptide substrates, indicating the requirement for a positively charged substrate N-terminus.
As can be seen from the gliptins shown in figure 5, the compounds which can inhibit HsDPPIV activity are structurally diverse; however, all the current gliptin inhibitors contain either a cyanopyrrolidine functional group (a 5-membered pyrrolidine ring with a nitrile moiety, also known as pyrrolidine-2-nitrile), as in vildagliptin and saxagliptin, or a modified cyanopyrrolidine ring, or a pyrrolidine ring in which the nitrile present in cyanopyrrolidine is replaced with a hydrogen, fluoro, acetylene or methanol functional group(54). The presence of a nitrile group enhances the potency of such drugs by inducing partial, transient covalent trapping of the Ser630 hydroxyl of HsDPPIV by the nitrile, as well as hydrogen bonding with the nearby Tyr547(55). Studies on saxagliptin(56) indicated a 2-step mechanism in which an initial encounter complex is formed, followed by formation of a covalent intermediate. Ionisation of the Asp708-His740 catalytic pair enhances the nucleophilicity of the Serine, and thus its ability to be targeted for covalent addition. The P2 terminal amine of the inhibitor forms very short, strong hydrogen bonds, characterised using ¹H NMR, with residues Glu205/Glu206, which are contributed to the enzyme active site by the β-propeller. Positioning of the pyrrolidine ring between the stacks of the S1 binding residues Tyr662 and Tyr666 also provides additional binding energy(53). Generally speaking, inhibitors lacking the pyrrolidine-2-nitrile ring show a 10-fold reduction in their potency. Weaker inhibitors based on Valine pyrrolidide showed hydrogen bonding of the carbonyl group of the inhibitor to residues Asp124 and Asn710. Experiments on non-cyanopyrrolidines revealed that alteration of the steric constraints of the pyrrolidine ring in the S1 pocket greatly reduced drug potency; for example, the introduction of a methyl group effectively destroyed all inhibitor potency(54).
Stereochemical bulking of the S2 binding region, however, showed an increase in inhibition.

The observations described above make it clear that the structure and positioning of residues in the active site of HsDPPIV play a key role in its function: the aforementioned structure-function relationship. As mentioned previously, structural characteristics of the HsDPPIV active site have seen use in optimising the selection of inhibitors that bind the active site. Such knowledge can provide insight into the development of similar inhibitors of PgDPPIV, provided there is sufficient homology between the active sites of the two homologs. If there is strong homology in the active site residues of PgDPPIV, then it may still be possible to identify other regions that are unique to PgDPPIV, such as the hydrogen bonding of the Valine pyrrolidide inhibitor described above.

1.4.2 - Sus scrofa dipeptidyl peptidase IV
The 766 residue SsDPPIV shows high sequence similarity to its eukaryotic ortholog HsDPPIV, bearing 88.38% sequence similarity. With PgDPPIV it shares 25.45% similarity, the same as PgDPPIV does with HsDPPIV. High sequence homology to HsDPPIV may imply conserved function; however, the structure of SsDPPIV revealed a mechanism which has thus far not been observed in HsDPPIV. The 1.8 Å resolution structure of native SsDPPIV, which was solved by Engel in 2003(57), revealed that, unlike the homodimeric structure observed in HsDPPIV crystals, SsDPPIV crystallises as both a homodimer and a homotetramer (figure 11), with tetramerisation believed to be involved in cell-cell contacts at the cell surface. The tetramerisation of 2 homodimers is brought about via interactions involving the glycosylated blade IV of the β-propeller domain, which comprises hydrophilic residues Asn279-Gln286. The residues from each dimer form an anti-parallel β-sheet, with further contributions from blade V.
From the α/β hydrolase domain, helix Met746–Ser764 and the loop comprising residues Phe713-Thr736 (in particular helix Gln714–Asp725 and strand Asp729–Thr736; similar to HsDPPIV) constitute the central dimerisation motif, as in HsDPPIV. The 2nd β-sheet from the blade IV subdomain also contributes to dimerisation via stabilisation of these loops, as in HsDPPIV. The N-terminal β-propeller domain comprises residues Arg54-Asn497, and the α/β hydrolase domain comprises residues Gln508-Pro766. The asymmetric 8-bladed topology of the β-propeller domain (blades I & VI-VIII and blades II-V) is consistent with that found in HsDPPIV, with the first blade (Phe53–Tyr58) and last blade (Glu499–Met503) forming a non-covalent linkage. Again there are 2 solvent-exposed openings to the active site: one through the β-propeller and one through an exposed side entrance generated by the kinked arrangement of blade I and blades II-IV. The diameter of the β-propeller opening is 9 Å from blade IV to VIII and 15 Å from blade II to VI, with the tunnel widening to 15 Å and 25 Å respectively towards the interface. The distance from the surface of the opening to the active site is 37 Å. The side entrance is oval with dimensions of 15 Å and 22 Å, and measures 20 Å from the surface to the active site. The blades are stabilised by numerous Cysteine disulfide bonds, which comprise all disulfide bonds in the monomer aside from Cys649-Cys762, which cross-links the C-terminal helix Met746-Ser764 with the β-sheet Cys649-Val652, stabilising the α/β hydrolase domain, as in HsDPPIV. Similarly, all glycosylation sites are present in the β-propeller domain apart from Asn685, half of which are orientated away from the α/β hydrolase domain. Only Asn279 on blade IV is post-translationally modified.

The active sites of HsDPPIV and SsDPPIV are also highly conserved, being almost identical. The catalytic triad comprises residues Ser630, Asp708 and His740. The hydrogen donors of the oxyanion stabilisation pocket are Tyr631 and Tyr547.
The pyrrolidine ring is accommodated by a hydrophobic pocket formed by the side chains of Tyr666, Tyr662, Val711, Val656 and Trp659. In crystallization with an inhibitor, the P2 carbonyl oxygen sits in an electrostatic pocket formed by the side chains of Arg125 (positioned on the hairpin loop between strands 2 and 3 of blade II) and Asn710. Glu205 and Glu206, positioned on a short helical insertion within strand 1 of the β-propeller blade IV, interact with the free amino terminus of the P2 residue.

Figure 11: Cartoon representation of the SsDPPIV homotetramer, taking on a quaternary conformation as a dimer of dimers. Rendering was accomplished with Pymol, using crystal structure 2aj8 from the PDB.

1.4.3 - Stenotrophomonas maltophilia dipeptidyl peptidase IV

To date, the 741-residue SmDPPIV (figure 12) is the only bacterial dipeptidyl peptidase IV for which the structure has been solved(58). Sequence alignment with Clustal W2 indicates that PgDPPIV and SmDPPIV share 27.66% of their sequence, a higher similarity than that observed with HsDPPIV and SsDPPIV. However, this similarity is much lower than would perhaps be expected between bacterial species, compared to the homology demonstrated between the 2 eukaryotic orthologs. The GWSYGGφφ sequence found in the conserved DPPIV motif differs in SmDPPIV, where the sequence is GWSNGGYM, with the introduced Asparagine predicted to participate in substrate recognition of 4-hydroxyproline, as evidenced by an N611Y mutation resulting in a decrease to 30.6% of the wild type hydrolytic activity. Interestingly, Asparagine is also found in the same position in the GxSxGG consensus of POP, for which there are no observed bacterial homologs.
The 2.8 Å resolution structure of SmDPPIV is again very highly conserved, with the key topological features of both the α/β hydrolase domain (Gln484-Pro741) and the β-propeller domain (Leu39-Ala483) being very similar to those found in HsDPPIV and SsDPPIV; however, it contains an α-helix (containing a Glu206/Glu207 di-Glutamate motif) situated between the 1st and 2nd sheets of blade IV that is not observed in the eukaryotic homologs. Additionally, the 2nd and 3rd sheets of blade II are shorter than those found in SsDPPIV, generating a larger side entrance. Arg125, which is believed to take part in substrate recognition, is located in between these 2 blades, but is displaced from the active site. Finally, the β-propeller is also displaced relative to its position from the α/β hydrolase domain in other DPPIV enzymes, following a rotation of 10°. The catalytic triad comprises residues Ser610, Asp685 and His717, with the active site sitting in the domain interface. Residues Tyr524 and Asn611 comprise the oxyanion stabilising residues. Val636, Trp639, Tyr642, Tyr646 and Val688 form the hydrophobic binding pocket. As mentioned previously, residues Glu206/Glu207 provide the N-terminus substrate binding pocket. The residues found in the active site of SmDPPIV are shifted from those residues in the active site of SsDPPIV, leading to a substantial increase in the size of the hydrophobic binding pocket of SmDPPIV.

Figure 12: Cartoon representation of the SmDPPIV homodimer. Rendering was accomplished with Pymol, using crystal structure 2ecf from the PDB.

1.4.4 - P. gingivalis prolyl tripeptidyl peptidase

In 1999 Banbula et al deduced the presence of another oligopeptidase that takes part in the growth and host-evasion mechanisms of P. gingivalis: prolyl tripeptidyl peptidase (PgPTP)(59), also a member of the S9B subfamily.
Failure to inactivate with Cysteine and metalloproteinase inhibitors, combined with observed inactivation with diisopropyl fluorophosphate (DFP), implicated its role and mechanism as a serine peptidase. Clustal W2 sequence alignment of the 732-residue PgPTP revealed the presence of the DPPIV family consensus sequence GWSYGG, and a total sequence similarity of 25.86% with PgDPPIV. The structure of PgPTP was resolved at 2.1 Å by Kiyoshi Ito et al in 2006 (figure 13)(60). Again the overall topology of both the α/β hydrolase domain (Asn471–Leu732) and the β-propeller (Glu45–Lys470) is highly conserved. Like SmDPPIV, there are differences in the β-propeller domain compared to the eukaryotic homologues; however, some of these differences can be attributed to this enzyme's function as a tripeptidyl peptidase rather than a dipeptidyl peptidase, such as the widening of the side entrance to facilitate larger substrates. The subunit interface is dominated by hydrophobic interactions and contains four salt bridges, between Lys232 and Glu245 (of blade IV subdomain Lys232–Thr256) and between Arg669 and Asp730 (Gly688–His731 loop), as seen in the homologs. Perhaps paradoxically, PgPTP bears little structural homology to other tripeptidyl peptidases. The catalytic triad consists of residues Ser603, Asp678 and His710. The hydrophobic pocket consists of residues Val629, Trp632, Tyr635, Tyr639 and Val680, with an additional residue, Val681, also participating in the hydrophobic binding of the substrate. Tyr518 and Tyr604 comprise the oxyanion stabilising residues, much as they do in the dipeptidyl peptidases described thus far. The glutamate motif is again present as residue Glu205; however, the hydrogen bonding role of Glu206, which is not present in PgPTP, is replaced by Glu636.

Figure 13: Cartoon representation of the PgPTP homodimer. Rendering was accomplished with Pymol, using crystal structure 2d5l from the PDB.
1.5 - Structural Biology: X-ray Diffraction/MAD

X-ray diffraction is used to determine the electron density of the molecules that constitute the crystal, which can then be interpreted to fit an atomic model, which in current times is done computationally. X-rays are fired at a fixed crystal and interact with the electron clouds surrounding the molecules within the crystal. The x-rays are diffracted by the electrons and are detected using an x-ray detector, typically an imaging plate, which measures the intensities (I) and positions of the diffracted x-rays. By crystallising a molecule of interest, a lattice is formed composed of repeating units (referred to as unit cells), where the smallest portion of the unit cell that has no internal crystallographic symmetry is referred to as the asymmetric unit. Crystallisation of molecules permits amplification of the diffracted x-rays, as the intensity from a single molecule is too small to be detected. The electron density is defined by the equation

$$\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{\ell} |F(h, k, \ell)| \, e^{-2\pi i (hx + ky + \ell z) + i\alpha(h, k, \ell)}$$

where $(x, y, z)$ are the real-space coordinates (expressed as fractions of the unit cell edges), $h$, $k$, $\ell$ are the Miller indices (integers describing the orientations of a plane or set of planes within a lattice in relation to the unit cell), $V$ is the volume of the unit cell, $\alpha$ is the phase of the diffracted x-rays (a mathematical function describing the fraction of a sinusoidal wave cycle that has elapsed relative to the wave cycle origin), and $F$ is a structure factor (a mathematical description of how the contents of the unit cell scatter incident x-rays), where $|F(h, k, \ell)|^2 \propto I(h, k, \ell)$. Determination of phase is important because phase cannot be measured by any currently available detectors. Multi-wavelength anomalous dispersion is one technique used to determine phases, which makes use of an anomalous scatterer. Anomalous scattering is scattering that results in a change of phase of the radiation as a result of inelastic interaction with the scatterer.
This results in an electronic transition of a low energy electron to a higher electron shell. An electromagnetic wave causes an electronic transition when the energy of the incident photon, at a given wavelength, is equal to the energy difference between the electron shells. The incorporation of selenium into methionine residues provides a heavy atom which acts as an anomalous scatterer. For selenium this usually occurs as a transition from the K shell (1S subshell) to the 5S subshell, which occurs at a photon wavelength of 0.9795 Å, corresponding to the x-ray region of the electromagnetic spectrum. The energy of this transition corresponds to approximately 12.7 keV, which is described as the K shell absorption edge of selenium. In MAD, the change in scattering observed at multiple wavelengths allows for the determination of the phases of the x-rays diffracted by the crystal.

1.6 - Aims

The primary objective of this project was to build, refine and characterise an atomic model of the three-dimensional structure of PgDPPIV, using x-ray diffraction data collected from PgDPPIV crystals. Along with a standard synchrotron x-ray dataset, selenomethionine-derived crystals would be utilised with multi-wavelength anomalous dispersion (MAD) to resolve the crystallographic phases of the crystal. The determination of the three-dimensional structure, how it relates to the enzymatic activity of PgDPPIV, and structural comparisons with the structurally-available homologs of PgDPPIV (HsDPPIV, SsDPPIV, SmDPPIV, PgPTP) will hopefully provide some new insight into the design of selective PgDPPIV inhibitors, providing potential treatment for adult periodontitis and other PgDPPIV-associated disease states.

2 - Materials & Methods

2.1 - Expression, purification & crystallisation

Preliminary expression, purification and crystallisation of PgDPPIV was carried out by Rea D et al(31).
A plasmid vector containing PgDPPIV (Thr21-Lys723) and an N-terminus polyHistidine-tag was constructed and used to transform Escherichia coli strain BL21 (DE3). After extraction, PgDPPIV was purified using Nickel affinity chromatography. Selenomethionine-derived crystals were prepared by transformation of auxotrophic E. coli strain B834 (DE3) and supplementation with proteinogenic amino acids (excluding methionine) and Selenomethionine, followed by purification as above. Both were crystallised using the hanging drop vapour diffusion technique in 40% 2-methyl-2,4-pentanediol (MPD) and 100 mM Tris-HCl pH 7.5 buffer, with subsequent microseeding at 35-40% MPD and 100 mM Tris-HCl pH 8.0 several months after initial crystallisation.

Figure 14: Visual light photograph of 2.7 Å resolution diffracting PgDPPIV crystals, with the largest dimension of 0.3 mm.

2.2 - X-Ray Diffraction

Two x-ray diffraction datasets were collected ahead of the start of the project: standard synchrotron radiation diffraction of the hanging drop vapour diffusion crystals, which yielded an electron density with a resolution of 2.5 Å, and a 3-wavelength MAD dataset from the selenomethionine-derived crystals. Tuneable wavelengths were selected near the absorption edge of selenium. Initial synchrotron data collection and processing statistics were not available to be listed in this section; however, table 1 contains partial processing statistics acquired from the data provided for this project. This includes the X-ray diffraction pattern for PgDPPIV shown in figure 15, which was generated from the provided PgDPPIV mtz file, which stores the reflection data.

Figure 15: FPK X-ray diffraction pattern of PgDPPIV, acquired from the provided mtz reflections.
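Because the MAD wavelengths are chosen relative to the selenium absorption edge, it can be useful to convert between x-ray wavelength and photon energy. The following is a minimal sketch, assuming the standard relation E = hc/λ with hc ≈ 12.398 keV·Å; the function names are illustrative only and are not part of any software used in this project.

```python
# Product of Planck's constant and the speed of light, in keV * Angstrom.
HC_KEV_ANGSTROM = 12.398419843


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an x-ray wavelength (Angstrom) to photon energy (keV)."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def photon_wavelength_angstrom(energy_kev: float) -> float:
    """Convert a photon energy (keV) to wavelength (Angstrom)."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


# Wavelength quoted in section 1.5 for the selenium K absorption edge.
se_edge_wavelength = 0.9795
se_edge_energy = photon_energy_kev(se_edge_wavelength)
print(f"{se_edge_wavelength} A corresponds to {se_edge_energy:.2f} keV")
```

The result is consistent with the approximately 12.7 keV quoted for the selenium K edge in section 1.5.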
2.3 - Data Processing

2.3.1 - Solve/Resolve

The Solve/Resolve suite was used for preliminary solution of the crystallographic phases of the PgDPPIV crystal using the MAD dataset, as well as for the initial automated model building of the PgDPPIV asymmetric unit(61-67). This was done in advance of starting work on the project.

2.3.2 - WinCoot

Refinement and manipulation of the atomic coordinates of PgDPPIV was carried out using the WinCoot (Crystallographic Object-Oriented Toolkit) software, on the Windows operating system(68, 69). For the sake of practicality, only a single monomer of PgDPPIV was refined, with later superpositioning being carried out on the other monomers present in the asymmetric unit. For each residue, electron density fitting, geometric restraint fitting, rotamer fitting, Asn/Gln B factor analysis and intermolecular bonding environment analysis were assessed, as well as H2O positioning and hydrogen bonding environments.

2.3.3 - CCP4

The CCP4 suite(70) was used for much of the refinement, in conjunction with model building. Programs contained within the CCP4 suite were accessed using the Windows ccp4i GUI(71). Software packages used in the refinement primarily include Refmac5(72, 73) for automated refinement; rampage(74) for stereochemical verification of the built model (via construction of Ramachandran plots: torsion angle φ (about the N-Cα bond) against torsion angle ψ (about the Cα-C bond)); superpose(75) to superpose the secondary structure of the refined monomer of PgDPPIV onto the other monomers present in the asymmetric unit prior to Refmac5 refinement; and Baverage(70), which was used for the B factor analysis of the constructed model. Additionally, sfcheck(76) was used for evaluating other structure-factor parameters. Translation/Libration/Screw-motion (TLS) refinement protocols(77) were also utilised in Refmac5 refinement of PgDPPIV.
TLS is a mathematical method used to model the local displacement of atoms that are part of a rigid body, i.e. domains, which displace around a mean position. TLS bodies were determined by input of the restrained refinement of PgDPPIV to the TLS Motion Determination Server.

The asymmetric unit of the PgDPPIV crystal consists of 8 molecules; model building of the electron density has shown that the PgDPPIV crystal comprises 8 monomeric PgDPPIV chains (A-H), which are arranged as a homotetramer of 4 homodimers. The unit cell symmetry of the PgDPPIV crystal is characterised by a P21 space group, defining a monoclinic crystal system with a single axis of crystallographic symmetry, relating an asymmetric unit at coordinates (x, y, z) to an asymmetric unit at coordinates (-x, y+1/2, -z) as a result of a 180° screw turn (figure 16). The asymmetric unit of PgDPPIV is shown in figure 17(78), with a visual representation of the symmetry of the asymmetric unit in the same cell shown in figure 18. The unit cell parameters for a, b, c (Å) and α, β, γ (°) are 117.0, 112.9, 310.9, 90.0, 95.0 and 90.0 respectively.

Figure 16: Diagrammatic representation of the P21 monoclinic space group, showing the 180° screw turn symmetry of the asymmetric unit and its symmetry mate at (-x, y+1/2, -z).

Table 1 shows the refinement statistics for 3 output models of PgDPPIV: the unrefined model, the refined model using restrained fit without TLS, and the refined model using restrained fit with TLS. Refinement without TLS was carried out with 30 cycles of restrained fit refinement, while refinement with TLS was carried out with 20 cycles of TLS refinement, followed by 20 cycles of restrained fit refinement. For TLS refinement, 3 designated TLS bodies were available for each monomer: residues 21-36 (N-terminus), residues 37-469 (β-propeller domain) and residues 470-723 (α/β hydrolase domain). The model built with Solve/Resolve had an initial R-factor of 28.1% and an Rfree value of 31.0%.
The R-factor (or reliability factor) is a parameter described by the equation $R = \frac{\sum \left| |F_{obs}| - |F_{calc}| \right|}{\sum |F_{obs}|}$, which is a measure of the difference between the observed structure factors of the original electron density map (Fobs) and the structure factors of the rebuilt electron density map calculated from the model (Fcalc), summed over all reflections. Thus a lower R-value is indicative of a better fit of the built model to the original map. The R-value typically ranges from 0.6 for disordered molecules to ~0.2 for organised macromolecules. Rfree is a variable used to detect R-factor bias from refinement; it is calculated in the same way as the R-factor, but using only a fixed percentage of the data set (typically 5-10%) that is excluded from refinement. Rfree is typically higher than the R-factor. Analysis shows that restrained fit without TLS refinement reduced the R-factor of the observed model to 21.3%, and the Rfree value to 26.9%, while the TLS-refined model had an R-factor of 23.9% and an Rfree value of 28.2%. Structural analysis used in figures featured in this section utilises the lowest R-factor model.

Table 1: Refinement parameters for PgDPPIV crystal data sets; unrefined, refinement using restrained fit without TLS, and refinement using restrained fit with TLS. *These B factor values represent the average across all chains (A-H). **B factor values for water molecules may be incorrect, as there were no subsequent refinements of the model with waters incorporated into the solvent.
Refinement parameters | Unrefined | Refined | TLS Refined
Resolution range (Å) | 77.429-2.5 | 77.429-2.5 | 77.429-2.5
Number of used reflections | 270,646 | 270,646 | 270,646
Percentage observed (%) | 99.0897 | 99.0907 | 99.0907
Percentage of free reflections (%) | 1.9945 | 1.9945 | 1.9945
Overall correlation coefficient | 0.8910 | 0.9381 | 0.9218
Free correlation coefficient | 0.8696 | 0.9035 | 0.8933
Cruickshank's DPI for coordinate error (Å) | 0.5016 | 0.3892 | 0.4255
DPI based on free R-factor (Å) | 0.3190 | 0.2792 | 0.2909
Overall figure of merit | 0.7513 | 0.8077 | 0.7935
R-factor | 0.2814 | 0.2129 | 0.2387
Rfree | 0.3097 | 0.2688 | 0.2824
Wilson B factor (Å2) | - | 64.18 | 64.18
Average B factors*
Total protein atoms (Å2) | 26.299 | 49.928 | 16.281
Main chain atoms (Å2) | 25.749 | 47.614 | 15.372
Side chain atoms (Å2) | 26.839 | 51.436 | 17.186
Water molecules (Å2) | - | 30** | -
RMS deviations
Bond lengths (Å) | 0.0127 | 0.0156 | 0.0160
Bond angles (°) | 1.6723 | 1.9147 | 1.9042
Chiral volume | 0.1222 | 0.1087 | 0.1063
Average B factor, main chain atoms (Å2) | 0.388 | 2.760 | 0.733
Average B factor, side chain atoms (Å2) | 1.270 | 3.409 | 1.067

Stereochemistry remains relatively unchanged with refinement. Polypeptide backbone stereochemistry assessed with rampage showed that for the unrefined model, 5115 (91.2%) residues fall in the favoured region, 360 (6.4%) residues reside in the allowed region and 133 (2.4%) residues are outliers. In the final non-TLS refined model, 5087 (90.7%) residues were in favoured regions, 386 (6.9%) were in allowed regions and 135 (2.4%) in outlier regions. The TLS refined model contained 5137 (91.6%) residues in the favoured region, 347 (6.2%) residues in the allowed region and 124 (2.2%) in the outlier region.

B factors are a parameter that defines the motion of an individual atom, and are often referred to as "temperature factors", as increased motion correlates with increase in temperature (this is an advantage conferred by using a cryostream in x-ray diffraction, as it reduces temperature and hence motion, allowing for higher resolution structures).
A smaller B factor is indicative of lower atomic motion, and vice versa. Analysis of the models shows distinctive sets of B factor data for each model. The original model had an average B factor for all protein atoms of 26.3 Å2; for the non-TLS refined model this value was 49.9 Å2, while for the TLS refined model this was 16.3 Å2. The average B factors of the main chain atoms were 25.7 Å2, 47.6 Å2 and 15.4 Å2 respectively, while the average B factors for side chain atoms were 26.8 Å2, 51.4 Å2 and 17.2 Å2 respectively. The average B factors across each chain for the original model are also consistent with one another (table 2). For the non-TLS refined model (table 3) this consistency was not observed: B factors remained consistent across chains A-F; however, chain G has a higher average B factor for all atoms of 54.7 Å2, while chain H deviates further with an average B factor of 87.0 Å2. Figure 19 shows a localised B factor putty of chains A-H, which shows that for chains A-H higher B factors generally reside within the β-propeller, with some higher B factors at the N-terminus. In chain G the B factors of the α/β hydrolase are slightly exaggerated, and for chain H the highest B factors show observable bias towards the β-propeller domain. The average B factor for total protein atoms of the TLS-refined model (table 4) also showed similar inconsistencies, with chain B presenting higher average B factors (36.0 Å2), while chains E and H showed lower average B factors (6.8 Å2 and 7.0 Å2).

Across the entire asymmetric unit of the non-TLS refined model, 760 H2O molecules (avg. 95 per monomer) were placed in electron density signal peak positions with RMS deviation of 1. The minimum water distances were set to 2.4 Å, with maximum water distances set to 3.2 Å. The B factor of the waters positioned was 30 Å2; however, as refinement could not be carried out post-addition of H2O due to software complications, this may not be totally representative.
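As a rough guide to what the B factors quoted above mean in terms of atomic displacement, an isotropic B factor can be converted to a root-mean-square displacement. The following is a minimal sketch, assuming the standard isotropic relation B = 8π²⟨u²⟩; the function name is hypothetical, and the input values are the average protein-atom B factors of the three models.

```python
import math


def rms_displacement(b_factor: float) -> float:
    """Return the RMS displacement (Angstrom) implied by an isotropic
    B factor (Angstrom^2), using B = 8 * pi^2 * <u^2>."""
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Average B factors for all protein atoms, as reported in the text.
average_b = {
    "unrefined": 26.3,
    "non-TLS refined": 49.9,
    "TLS refined": 16.3,
}

for model, b in average_b.items():
    print(f"{model}: B = {b} A^2 -> RMS displacement ~ {rms_displacement(b):.2f} A")
```

Under this assumption, an average B factor of 49.9 Å2 corresponds to an RMS displacement of roughly 0.8 Å.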
RMS deviations of bond lengths (Å) and bond angles (°) are variables associated with a restrained fit refinement. A value close to 0 is indicative of an overrepresentation of geometry in the atomic model. Refinement where these values are equal to 0 is referred to as a rigid fit, which may or may not correlate with the atomic fitting of the model to the electron density. For the original model these values are 0.0127 Å and 1.6723° respectively, while for non-TLS refinement they are 0.0156 Å and 1.9147°, and for TLS refinement they are 0.0160 Å and 1.9042°. The increase of these values with restrained refinement is indicative of reduced geometric constraints on the atomic model.

Tables 2-4: Table of average B factors & RMS deviations for 2) the unrefined model, 3) non-TLS refined model and 4) TLS refined model. B factors acquired using Baverage.

Table 2 (unrefined model)
Chain | Total protein atoms (Å2) | Main chain atoms (Å2) | Side chain atoms (Å2) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms
ALL | 26.299 | 25.749 | 26.839 | 0.388 | 1.270
A | 26.000 | 25.436 | 26.562 | 0.384 | 1.288
B | 27.936 | 27.404 | 28.467 | 0.470 | 1.523
C | 26.075 | 25.466 | 26.682 | 0.405 | 1.362
D | 25.940 | 25.375 | 26.504 | 0.379 | 1.229
E | 25.713 | 25.134 | 26.291 | 0.398 | 1.313
F | 26.127 | 25.567 | 26.686 | 0.398 | 1.298
G | 26.216 | 25.682 | 26.749 | 0.358 | 1.164
H | 26.313 | 25.856 | 26.768 | 0.314 | 0.979

Table 3 (non-TLS refined model)
Chain | Total protein atoms (Å2) | Main chain atoms (Å2) | Side chain atoms (Å2) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms
ALL | 49.928 | 47.614 | 51.436 | 2.760 | 3.409
A | 41.989 | 39.798 | 44.173 | 2.178 | 3.091
B | 44.406 | 41.928 | 46.876 | 2.298 | 3.367
C | 40.805 | 38.454 | 43.148 | 2.171 | 3.280
D | 44.914 | 42.719 | 47.102 | 2.531 | 3.211
E | 40.277 | 38.080 | 42.467 | 2.200 | 3.149
F | 45.359 | 41.085 | 43.225 | 2.362 | 3.158
G | 54.651 | 52.480 | 56.814 | 2.879 | 3.352
H | 87.024 | 86.366 | 87.681 | 5.458 | 4.667

Table 4 (TLS refined model)
Chain | Total protein atoms (Å2) | Main chain atoms (Å2) | Side chain atoms (Å2) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms
ALL | 16.281 | 15.372 | 17.186 | 0.733 | 1.067
A | 17.371 | 16.336 | 18.402 | 0.734 | 1.171
B | 35.966 | 33.823 | 38.101 | 1.908 | 2.595
C | 16.081 | 14.991 | 17.168 | 0.763 | 1.234
D | 14.854 | 13.982 | 15.724 | 0.639 | 0.962
E | 6.765 | 6.518 | 7.011 | 0.312 | 0.430
F | 16.165 | 15.137 | 17.191 | 0.723 | 1.132
G | 16.027 | 15.264 | 16.787 | 0.614 | 0.840
H | 7.015 | 6.924 | 7.105 | 0.168 | 0.172

Figure 16: Cartoon rendering of the 8-chained asymmetric unit of PgDPPIV. The asymmetric unit is topologically organised as a homotetramer of homodimers. Water molecules present in the solvent are shown as small magenta coloured spheres. Rendering was accomplished with Pymol.

Figure 17: Cartoon representation of the symmetry of the asymmetric unit of PgDPPIV in the unit cell. Figure 17A shows the relative front view of the unit cells as shown in figure 15, along primary axes a and c. Figure 17B shows the relative top view of the unit cell, along axes b and c. Figure 17C shows the side view of the unit cell, along axes a and b. Rendering was accomplished with Pymol.

Figure 18: Cartoon B factor putty representation of the monomers constituting the asymmetric unit of PgDPPIV; chains A to H as labelled. The β-propeller is oriented to the left, while the α/β hydrolases are oriented to the right. Rendering accomplished with Pymol.

The active dimeric and monomeric models of PgDPPIV are shown in figure 19. The N-terminus region of the protein extends past the α/β hydrolase domain, as is seen in other homologs. Transmembrane region prediction using TMHMM(79, 80) suggests residues Met1-Pro4 form the periplasmic portion of the protein, with residues Val5-Gly22 forming the transmembrane spanning region of the protein. As described previously, the α/β hydrolase domain comprises residues 470-723, and the β-propeller domain residues 37-469: these are shown in figure 20. Tables 5 and 6 show the residue positions of the secondary structure elements of the β-propeller and the α/β hydrolase domains respectively.
Assessment of these secondary structure elements shows that the β-propeller domain deviates somewhat from the known topological secondary structure organisation of this domain. Specifically, a number of β-sheets that would be expected in the propeller are absent: the expected arrangement is for blades I-III and V-VIII to contain 4 β-sheets each and blade IV to contain 7 β-sheets (5 in the main blade, and 2 in the subdomain), which we do not see in PgDPPIV. For blades I-VIII we see 3, 4, 3, 7, 4, 4, 3 and 3 β-sheets respectively. Sequence alignment of blade structure with SmDPPIV indicates that blades I & III both lack β-sheet 1 (Arg43-Ser47, Ile144-Phe147), while blades VII & VIII lack β-sheet 4 (Thr410-Lys411, Lys452-Arg455). Additional α-helices are present in blade II (Val127-Arg129) and blade VI (Ser328-Ala333). The α/β hydrolase domain shows much less discrepancy in secondary structure; however, an extra α-helix is present between β6 and β7, although it is possible this could be due to an interruption of an α-helix present in other homologs. It is difficult to determine which residues this constitutes.

The solvent-exposed opening of the β-propeller shows an ellipsoid topology, measuring 19.4 Å between blades III and VII at its greatest and 12.9 Å between blades I and V at its shortest. The cavity opens towards the interface, although this is difficult to measure due to the bending of blades I-III resulting in a loss of oval geometric symmetry and the formation of the side entrance. The distance between blades IV and VIII measures 19.3 Å, while the distance between blades I and VI measures 19.7 Å. The average distance of the catalytic serine to the blades at the β-propeller entrance is 42.4 Å. The solvent-exposed side entrance is slightly larger than the β-propeller entrance, with diameters of 23.7 Å and 20.5 Å. The entrance sits approximately 25.5 Å from the active site serine.
The diameters of the β-propeller entrance are slightly larger than those of comparable measurements in HsDPPIV and SsDPPIV, and may perhaps have some role in the opening of the active site to slightly larger oligopeptide substrates. The side entrance, conversely, is only slightly larger than in the eukaryotic homologs.

Figure 19: Cartoon representation of the dimeric configuration of PgDPPIV from A) the front perspective, B) the top-down perspective, and C) the monomeric configuration of PgDPPIV. Rendering was accomplished with Pymol.

Figure 20: Cartoon representations of A) the α/β hydrolase domain of PgDPPIV, showing the 8 β-sheet topology of the domain and B) the β-propeller domain of PgDPPIV, showing the 8-bladed topology and pore of the domain. Secondary structure features are also annotated. Rendering was accomplished with Pymol.

Tables 5 & 6: Table of 5) residues comprising the secondary structure elements of the PgDPPIV β-propeller domain and 6) residues comprising the secondary structure elements of the PgDPPIV α/β hydrolase domain, based on automatic secondary structure determination by Pymol.
Table 5 (β-propeller domain)
Sub Domain | Secondary Structure | Residues
N-terminus | α1 | Leu27-Ser32
Blade I | β1A | His52-Met57
Blade I | β1B | Ala63-Asn68
Blade I | β1C | Val75-Ser80
Blade II | β2A | Gln93-Val97
Blade II | β2B | His103-Thr108
Blade II | β2C | Ala121-Asp126
Blade II | α2 | Val127-Arg129
Blade II | β2D | Asn130-Pro134
Blade III | β3A | Met153-Arg158
Blade III | β3B | Asn161-Lys166
Blade III | β3C | Asp170-Gln174
Blade IV | β4A | Ile184-Asn186
Blade IV | α3 | Trp191-Phe197
Blade IV | β4B | Met203-Trp205
Blade IV | β4C | Phe211-Asp218
Blade IV | β4'1 | Glu224-Met229
Blade IV | β4'2 | Glu237-Lys242
Blade IV | β4D | Thr252-Asn259
Blade IV | β4E | Arg263-Val268
Blade V | β5A | Arg280-Phe283
Blade V | β5B | Leu290-Leu295
Blade V | β5C | Asp301-His308
Blade V | β5D | Leu312-Met321
Blade VI | α3 | Ser328-Ala333
Blade VI | β6A | Lys335-Ala338
Blade VI | β6B | Gly340-Ser346
Blade VI | β6C | His353-Tyr357
Blade VI | β6D | His364-Arg366
Blade VII | β7A | Thr375-Asp381
Blade VII | β7B | Gly384-Ser390
Blade VII | β7C | Arg398-Tyr401
Blade VIII | β8A | Thr418-Phe423
Blade VIII | β8B | Tyr429-Ser435
Blade VIII | β8C | Val442-Arg448
Blade VIII | α4 | Val461-Ala469

Table 6 (α/β hydrolase domain)
Sub Domain | Secondary Structure | Residues
α/β hydrolase | β1 | Glu476-Thr482
α/β hydrolase | β2 | Gly485-Val493
α/β hydrolase | β3 | Val506-Gln510
α/β hydrolase | α1 | Trp527-Lys534
α/β hydrolase | β4 | Val537-Asp542
α/β hydrolase | α2 | Glu551-Thr557
α/β hydrolase | α3 | Val563-Gln578
α/β hydrolase | β5 | Ile587-Trp592
α/β hydrolase | α4 | Ser593-Gly606
α/β hydrolase | β6 | Ala612-Val616
α/β hydrolase | α5 | Trp622-Phe624
α/β hydrolase | α6 | Ser627-Met634
α/β hydrolase | α7 | Ala641-Ser647
α/β hydrolase | α8 | Ala649-Gln655
α/β hydrolase | β7 | Asn659-Gly665
α/β hydrolase | α9 | Leu673-Ala686
α/β hydrolase | β8 | Asp691-Tyr695
α/β hydrolase | α10 | Thr707-Asn722

Regions evidenced to be involved in dimerisation in HsDPPIV are also present in the PgDPPIV model. The 2 anti-parallel β-sheets of the blade IV subdomain comprise residues Pro223-Lys242. The α-helix/β-sheet loop which functions in catalytic triad relaxation upon dimerisation is composed of residues Leu673-Met696. Residues Trp622-Phe624 constitute the short α-helix that also contributes to the dimerisation interface. These conserved dimerisation motifs are indicated in figure 21.

Figure 21: Cartoon representation of the PgDPPIV homodimer with highlighted residues involved in dimerization: residues Trp622–Phe624 (red) and Leu673–Thr696 (blue) from the α/β hydrolase domain and the antiparallel β-sheet stack subdomain (Pro223-Lys242; yellow) protruding from the 2nd β-sheet of blade 4 in the β-propeller domain.
Figure 21A is from a front perspective, and figure 21B is a top-down perspective. Rendering was accomplished with Pymol.

The primary dimerisation motif, the subdomain of blade IV of the β-propeller, is the only dimerisation region that forms hydrogen bonds with the monomeric partner. The 2 β-sheets are residues Glu224-Met229 and Glu237-Lys242. Figure 22 shows the hydrogen bonding environment of the interface. The intermolecular contacts are formed by hydrogen bonding involving 3 residues: the hydrogen donor Arg226, and the hydrogen acceptors Asp238 and Pro236 of the partner monomer. The NE nitrogen of the side chain guanidinium group of Arg226 hydrogen bonds to the carboxyl oxygen of the side chain of Asp238, with a distance of 2.8-3.0 Å, while the NH1 nitrogen of the guanidinium group hydrogen bonds to the backbone carbonyl oxygen of Pro236, at a distance of 2.8-3.4 Å. Additionally, the NH2 nitrogen of Arg226 extending from chain B also hydrogen bonds to the side chain carbonyl oxygen of Glu224 from the same chain, at a distance of 3.3 Å. This hydrogen bond is not observed in chain A, where the distance is 5.2 Å due to the positioning of Glu224 away from the interface.

Figure 22: Ribbon/line representation of the hydrogen bonding environment of the primary dimerisation region of PgDPPIV; residues Pro223-Lys242. Rendering was accomplished with Pymol.

Figures 23 and 24 show the structural alignments of the α/β hydrolase and β-propeller domains (respectively) of PgDPPIV with the comparable domains of HsDPPIV, SsDPPIV, SmDPPIV and PgPTP. Alignment with Pymol was followed by quantification of the RMSD between the 2 aligned structures. For α/β hydrolase domain alignment, the RMS deviations for each homolog are 17.135, 16.269, 15.932 and 16.994 respectively. Likewise for the β-propeller domain, the RMSDs measure 17.799, 17.941, 17.252 and 17.476. RMS deviations of overall structure alignment were 22.065, 23.091, 18.303 and 18.350.
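For reference, the RMSD reported for an alignment is the root-mean-square distance between paired atoms after superposition. The sketch below assumes the two coordinate sets are already superposed and paired atom-for-atom; it does not reproduce Pymol's alignment procedure, and the function name and toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root-mean-square deviation between two equally sized, pre-superposed
    sets of paired atomic coordinates (same units in, same units out)."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom of the second set is displaced by 1 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
displaced = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(f"RMSD = {rmsd(reference, displaced):.3f}")
```

Because the measure averages over all paired atoms, a lower value indicates a closer overall fit between the two structures.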
The determined RMS deviation values indicate that for all 3 structural components that were aligned, SmDPPIV shows the greatest structural similarity. From overall structure alignment the prokaryotic homologs are better fits than the eukaryotic homologs, whereas for the individual domains there is very little difference in RMS deviations between eukaryotic and prokaryotic homologs. HsDPPIV has the worst fit to the α/β hydrolase domain while SsDPPIV has the worst fit with the β-propeller domain, although for the β-propeller domain there is little variation in RMS deviation values.

Figure 23: Ribbon representation of the structural alignment of the α/β hydrolase domain of PgDPPIV (magenta) with the homologous α/β hydrolase domains of A) HsDPPIV, B) SsDPPIV, C) SmDPPIV and D) PgPTP. Rendering accomplished with Pymol.

Figure 24: Ribbon representation of the structural alignment of the β-propeller domain of PgDPPIV (magenta) with the homologous β-propeller domains of A) HsDPPIV, B) SsDPPIV, C) SmDPPIV and D) PgPTP. Rendering accomplished with Pymol.

Figure 25: Stick/cartoon diagram of the PgDPPIV active site. Shown are the S1 hydrophobic residues Val619, Trp622, Tyr625, Tyr629 and Val671 (magenta), oxyanion stabilising residues Tyr511 & Tyr594 (green), catalytic triad Ser593, Asp668 & His700 (orange) and Glu195-Glu196 contributed from the β-propeller domain (cyan). Rendering accomplished with Pymol.

The active site of PgDPPIV is shown in figure 25, based on selection of residues from sequence alignment with its homologs. Sequence alignment suggests that the residues involved in substrate binding or catalysis are identical to those found in eukaryotic homologs. Alignments of the active site of PgDPPIV with all 4 of its homologs are shown in figure 26. As expected of a serine peptidase, the catalytic triad of PgDPPIV comprises residues Ser593, Asp668 and His700.
The oxyanion stabilising residues are Tyr511, located upstream of the catalytic triad, and Tyr594, neighbouring the catalytic serine. The S1 hydrophobic binding pocket is formed by residues Val619, Trp622, Tyr625, Tyr629 and Val671. The di-glutamate residues which bind the amino-terminus of the oligopeptide substrate of PgDPPIV are Glu195 & Glu196, which are provided by the β-propeller as they are in HsDPPIV/SsDPPIV/SmDPPIV.

Figure 26: Stick/cartoon diagram of the PgDPPIV active site (magenta) aligned with the homologous active sites of A) HsDPPIV, B) SsDPPIV, C) SmDPPIV and D) PgPTP. Rendering accomplished with PyMOL.

RMS deviation values for the active site alignments were 0.926, 0.912, 5.148 and 7.074 for each homolog respectively. This is in agreement with the sequence alignment of the active site residues, where the prokaryotic homologs, which have differing residues in the active site, show expectedly less similarity. Combined structural and sequence alignment of the active site suggests stronger homology with the functional features of the eukaryotic members of the DPPIV family, particularly concerning the core of the α/β hydrolase domain, despite the weaker structural homology observed in the tertiary structure of this domain.

Compared to HsDPPIV, the PgDPPIV active site shows a slight general shift. The di-glutamate motif is shifted the most, measuring a 0.9-1.4 Å shift, while the catalytic triad residues Ser593, Asp668 and His700 are shifted 0.7-0.8 Å (2.6 Å shift of the oxygen due to an alternate rotamer conformation; see discussion), 0.6-0.8 Å and 0.4-0.6 Å, respectively. Tyr629 is shifted the least, with the shift measuring 0.4-0.6 Å. SsDPPIV shows a more varied shift of the active site. Tyr511 is rotated, with the aromatic ring rotated to a perpendicular rotamer conformation (100°). The shift is shorter at the backbone (0.9-1.0 Å), while at the C1, C4 and oxygen atoms the shift increases by 1.4 Å, 2.3 Å and 2.8 Å respectively.
The catalytic triad is shifted 0.4-1.0 Å, 0.5-0.8 Å and 0.5-0.8 Å for Ser593, Asp668 and His700 respectively. Tyr625 and Tyr629 show the smallest shifts, of 0.5-0.6 Å and 0.4-0.6 Å respectively. The shifts compared to both homologs show a slightly more closed conformation of the active site, measuring on average 0.1-0.2 Å closure across the active site, with little rotation of the active site residues (aside from Tyr511 compared to SsDPPIV and Ser593 compared to HsDPPIV). Due to differences in residue composition, and thus poor alignment, measured comparisons with prokaryotic homologs were not made.

B factors (main chain & side chain) for the active site residues of the chain A monomer are (values in parentheses): Glu195 (39.7 Å²), Glu196 (37.0 Å²), Tyr511 (34.0 Å²), Ser593 (36.3 Å²), Tyr594 (32.8 Å²), Val619 (37.9 Å²), Trp622 (34.6 Å²), Tyr625 (35.4 Å²), Tyr629 (31.1 Å²), Asp668 (32.7 Å²), Val671 (34.8 Å²), His700 (34.5 Å²). These B factors are lower than the average B factor for all atoms (49.9 Å²), as would be expected of residues that are contained within the core of the protein. When 20 < B < 40, atom positions are well defined, although positional errors of around 0.5 Å remain possible. This could negate some of the observations stated above, and may indicate that the measured distances bear little significance.

For the structural interpretations and comparisons that have been made, the non-TLS refinement model was selected as the better of the 2 models produced from refinement. Both the non-TLS and TLS refined models have an Rfree value which is higher than their R-factor, which is an indication that the model has not been over-interpreted with R-factor bias.
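For readers unfamiliar with the two statistics compared here, the conventional crystallographic R-factor is the sum of the absolute differences between observed and calculated structure-factor amplitudes, divided by the sum of the observed amplitudes; Rfree is the same quantity evaluated over a test set of reflections excluded from refinement. The sketch below illustrates this under simplifying assumptions: the amplitudes are hypothetical (not PgDPPIV data), the function name `r_factor` is illustrative, and the scaling between observed and calculated amplitudes that refinement programs apply is omitted.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_obs and f_calc are paired amplitudes already on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical working-set reflections (used in refinement).
work_obs = [100.0, 80.0, 60.0, 40.0]
work_calc = [90.0, 85.0, 55.0, 45.0]

# Hypothetical test-set reflections (excluded from refinement).
free_obs = [50.0, 30.0]
free_calc = [40.0, 36.0]

r_work = r_factor(work_obs, work_calc)
r_free = r_factor(free_obs, free_calc)
print(f"R = {100 * r_work:.1f}%, Rfree = {100 * r_free:.1f}%")
```

In this toy case Rfree exceeds R, which is the pattern described in the text: the model fits the reflections it was refined against better than those it has never seen.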
The non-TLS refined model was selected based primarily on its lower R-factor and Rfree, which can be interpreted as a better fit of the model to the electron density compared to the TLS model. However, other important factors were also considered: while many of the refinement parameters vary little between refinements, others vary greatly. B factors in particular took part in this selection, as a great degree of variation between the 3 models was observed. B factors correlate with resolution, where an increase in resolution results in a decrease in the B factors, due to decreased motion of the atoms in the model. This relationship can be used to draw comparisons with structures at similar resolution ranges, which would bear similar average B factors. The 2.8 Å structure of SmDPPIV generally agrees with the average B factors of the non-TLS model, with average B factors measuring 41.0 Å² for all protein atoms, 40.5 Å² for main chain atoms, 41.6 Å² for side chain atoms and 30.9 Å² for water molecules (58). The 2.1 Å resolution structure of PgPTP also bears similar values, measuring 38 Å² for all protein atoms (60). The comparisons do however suggest that average B factors for the non-TLS model are slightly higher than would be expected. This is likely the result of the high B factors of chains G & H present in the asymmetric unit; if these chains corresponded with chains A-F, then the average B factors would be 43.0 Å², 40.3 Å² and 44.5 Å² for the aforementioned atoms. Exaggeration of these B factors correlates with the poor electron density fit of these chains, particularly chain H, which either results from poor superposition of refined chain A onto these chains, or could indicate that these chains adopt an alternative conformation in the asymmetric unit.
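To give the B factors quoted above a physical scale, an isotropic B factor can be related to the mean-square atomic displacement through the standard relation B = 8π²⟨u²⟩. The sketch below applies that relation to a few B values; the function name `rms_displacement` is illustrative, and the conversion is offered only as a rough guide, since the positional error estimate used in this work may rest on a different rule of thumb.

```python
import math


def rms_displacement(b_factor):
    """RMS displacement sqrt(<u^2>) in Å for an isotropic B factor in Å^2.

    Uses the isotropic relation B = 8 * pi^2 * <u^2>.
    """
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# B = 20 and B = 40 bracket the range discussed for the active site residues;
# 49.9 is the average reported for the model.
for b in (20.0, 40.0, 49.9):
    print(f"B = {b:5.1f} Å^2 -> RMS displacement ≈ {rms_displacement(b):.2f} Å")
```

Under this relation a B factor of 20 Å² corresponds to an RMS displacement of roughly 0.5 Å, which is the same order as the shifts measured between active site residues.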
In terms of geometric constraints the non-TLS refinement is also relatively sound, with RMS deviations varying very little between the non-TLS and TLS refined models. However, the refinement with TLS shows better backbone stereochemistry, with fewer residues in outlying regions. Manual alterations of φ/ψ angles in chain A were made with WinCoot (either with chi angle rotation or peptide bond flipping), and did show improvement in stereochemistry; the original chain A monomer contained 648 (92.3%) residues in favoured regions, 42 (6.0%) residues in allowed regions and 12 (1.7%) residues in outlier regions. The refined monomer contained 675 (96%), 23 (3.3%) and 4 (0.7%) residues in favoured/allowed/outlier regions respectively; however, after final superpositioning and refinement these were altered to 657 (93.6%), 37 (5.3%) and 8 (1.1%) residues, respectively. This could indicate that the attempted resolution of the stereochemistry of some residues imposed too rigid a geometric fit on the model, which contradicted the electron density fit. However, this change in the residues of chain A alone does not account for the similar Ramachandran profiles of the unrefined and non-TLS refined models, possibly indicating that the Ramachandran profile is much poorer in the chains which were generated by superposition of chain A. By comparison, the chain with the poorest electron density fit, chain H, contains 590 (84.1%), 81 (11.5%) and 31 (4.4%) residues in favoured/allowed/outlying regions. Ideally there would be no residues in outlying regions; 98% would be in the favoured regions, while the remaining 2% would be in allowed regions. The stereochemical profile of non-TLS refined PgDPPIV, with 90.7%, 6.4% and 2.4% of residues in favoured/allowed/outlying regions, has a lower percentage of residues in favoured regions and a higher percentage of residues in allowed and outlying regions.
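The percentages quoted for each Ramachandran region follow directly from the residue counts. As a small worked check, the sketch below recomputes them for the original chain A monomer using the counts given in the text; the function name `ramachandran_percentages` is illustrative, and values reported by a validation program may differ in the last digit depending on its rounding and residue-counting conventions.

```python
def ramachandran_percentages(favoured, allowed, outlier):
    """Convert residue counts per Ramachandran region into percentages (1 d.p.)."""
    total = favoured + allowed + outlier
    if total <= 0:
        raise ValueError("at least one residue is required")
    return tuple(round(100.0 * n / total, 1) for n in (favoured, allowed, outlier))


# Counts for the original chain A monomer, as quoted in the text.
original_chain_a = ramachandran_percentages(648, 42, 12)
print(original_chain_a)  # (92.3, 6.0, 1.7)
```

The same function can be applied to the other count triplets in the text to compare chains on a common footing.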
For comparison, Rampage assessment of the isolated chain A monomers of the models used in structural alignment shows that the 2.8 Å SmDPPIV has 92%, 7.8% and 0.3%, the 1.8 Å SsDPPIV has 95.9%, 29% and 0.1%, the 2.1 Å HsDPPIV has 93.8%, 6.1% and 1.0%, and the 2.1 Å PgPTP has 96.3%, 3.4% and 0.3% of residues in favoured/allowed/outlying regions. It is clear that further refinement of the stereochemistry of the PgDPPIV asymmetric unit is required.

During manual refinement with WinCoot, there were a few residues which did not optimally fit the electron density: residues Ala402, Ile403 and Leu454, which are shown in figure 27. In the image the 2Fobs-Fcalc map is shown in blue, while the Fobs-Fcalc maps are shown in green (positive density) and red (negative density); atoms placed in negative density are unfavourable density fits. The side chains of Ala402, Ile403 and Leu454 are visibly located within the red region. Manipulation of the model was unsuccessful in finding geometrically viable alternative conformations. Interestingly, Leu454 is contained within the theoretical β-sheet 4 of blade VIII, which is missing from the model; however, whether this is due to the selected conformation for Leu454 is unknown. It is possible that the missing blades of the β-propeller are the result of poor model building, such as in the presented case of Leu454; however, observation of the electron density fit of the predicted β-sheet regions in WinCoot shows no aberrant fitting of atoms to the electron density map. It is likely that the absence of β-sheets comes down to the conformation assumed during crystallisation of the protein.

There were 2 possible rotamers for the catalytic serine, which alter the hydrogen bonding distance to the nitrogen atom of the imidazole ring of His700, depending on the rotamer selected.
In the conformation shown in figure 26, the distance between the Ser593 oxygen and the His700 nitrogen is 3.3 Å, while in the alternative rotamer conformation this distance is shortened to 2.1 Å. For HsDPPIV, SsDPPIV, SmDPPIV and PgPTP these distances are 2.7 Å, 2.6 Å, 3.6 Å and 2.8 Å respectively. In either conformation, the distance deviates from those found in the other homologs, with the longer hydrogen bond conformation showing a deviation similar to that of SmDPPIV. For the 2.1 Å rotamer, closer hydrogen bonding could be indicative of stronger nucleophilic activity of the serine; this conformation would mean Ser593 might be orientated towards the binding site when ligand is bound in the acyl intermediate state. Co-crystallisation with a ligand is required to confirm whether this 2.1 Å distance is altered in the bound state.

Figure 27: Electron-density map fit of the atomic model for residues Ala402/Ile403 (A) and residue Leu454 (B) from chain A. Screenshot taken with WinCoot.

Curiously, although structurally PgDPPIV bears the strongest resemblance to its prokaryotic homologs, particularly SmDPPIV, the active site bears an uncanny resemblance to those of the HsDPPIV/SsDPPIV enzymes. Such homology could make the development of a drug inhibitor with specificity for only PgDPPIV difficult, if based on inhibitors which bind directly to the active site alone. However, the slight aforementioned differences in the dimensions of the active site could provide sufficient specificity; whether these are true positional differences, or errors which can be explained by the possible 0.5 Å position errors indicated by B factor analysis of these residues, is debatable. Further characterisation of residues found near the active site could be utilised in the synthesis of an inhibitor.
Although the structure that has been characterised provides some insight into the structure-function relationship of this enzyme, the slightly higher B factors compared to homologs, the poor stereochemical refinement of the polypeptide backbone and the absence of predicted secondary structures lead to the conclusion that further refinement of the atomic model is required to better elucidate the structural features of this pathologically important enzyme. Previous attempts at refinement against single-wavelength anomalous dispersion (SAD) data from PgDPPIV crystals, with a resolution range down to 2.5 Å, produced a model with an R-factor of 22.7% and an Rfree of 28.8% (Mistry, unpublished). The model produced here using MAD reflection data has the best fit to the electron density map to date; however, a higher resolution structure is still necessary to provide a more reliable model of PgDPPIV with fewer errors. This will likely mean a return to research into more optimised crystallisation conditions for PgDPPIV, which thus far has proven difficult due to the slow rate of growth of current PgDPPIV crystals. Alternatively, other avenues for improving resolution could be considered, including post-crystallisation methods such as crystal dehydration (addition of dehydration solution or extraction of water) or crystal annealing (returning the crystal to room temperature after flash-freezing, before returning it to the cryostream). Such treatments reduce the solvent content of the crystal and increase the packing of the molecule in the crystal lattice, which has been shown to increase the resolution of some crystals (81).

5 - Acknowledgements

I would like to give my thanks to Prof.
Vilmos Fülöp for not only advising and helping to direct the course of my work on the project, but also assisting with enquiries related to both the computational and theoretical components of the x-ray crystallography and the subsequent processing of data, including data refinement.

6 - Bibliography

1. Yang HW, Huang YF, Chou MY. Occurrence of Porphyromonas gingivalis and Tannerella forsythensis in periodontally diseased and healthy subjects. Journal of Periodontology. 2004;75(8):1077-83.
2. Chen T. P. gingivalis ATCC 33277 (1) Porphyromonas gingivalis Genome Project: The Forsyth Institute; 2002 [cited 2013 2nd June]. Available from: http://www.pgingivalis.org/ATCC33277(1).htm.
3. Bostanci N, Belibasakis GN. Porphyromonas gingivalis: an invasive and evasive opportunistic oral pathogen. FEMS Microbiology Letters. 2012;333(1):1-9.
4. Baker PJ, Evans RT, Roopenian DC. Oral infection with Porphyromonas gingivalis and induced alveolar bone loss in immunocompetent and severe combined immunodeficient mice. Archives of Oral Biology. 1994;39(12):1035-40.
5. DeCarlo AA, Windsor LJ, Bodden MK, Harber GJ, Birkedal-Hansen B, Birkedal-Hansen H. Activation and novel processing of matrix metalloproteinases by a thiol proteinase from the oral anaerobe Porphyromonas gingivalis. Journal of Dental Research. 1997;76(6):1260-70.
6. Matsuda N, Takemura A, Taniguchi S, Amano A, Shizukuishi S. Porphyromonas gingivalis reduces mitogenic and chemotactic responses of human periodontal ligament cells to platelet-derived growth factor in vitro. Journal of Periodontology. 1996;67(12):1335-41.
7. Wegner N, Wait R, Sroka A, Eick S, Nguyen KA, Lundberg K, et al. Peptidylarginine Deiminase From Porphyromonas gingivalis Citrullinates Human Fibrinogen and alpha-Enolase: Implications for Autoimmunity in Rheumatoid Arthritis. Arthritis and Rheumatism. 2010;62(9):2662-72.
8. Mikuls TR, Thiele GM, Deane KD, Payne JB, O'Dell JR, Yu F, et al.
Porphyromonas gingivalis and Disease-Related Autoantibodies in Individuals at Increased Risk of Rheumatoid Arthritis. Arthritis and Rheumatism. 2012;64(11):3522-30.
9. Eganhouse K, Keller JC, Drake D, Grigsby W, Wu-Yuan CD. Attachment of Porphyromonas gingivalis to titanium surfaces. Journal of Dental Research. 1993;72(ABSTR. SPEC. ISSUE):140.
10. Dennison DK, Huerzeler MB, Quinones C, Caffesse RG. Contaminated implant surfaces: an in vitro comparison of implant surface coating and treatment modalities for decontamination. Journal of Periodontology. 1994;65(10):942-8.
11. Leon R, Silva N, Ovalle A, Chaparro A, Ahumada A, Gajardo M, et al. Detection of Porphyromonas gingivalis in the amniotic fluid in pregnant women with a diagnosis of threatened premature labor. Journal of Periodontology. 2007;78(7):1249-55.
12. Li L, Messas E, Batista EL, Levine RA, Amar S. Porphyromonas gingivalis infection accelerates the progression of atherosclerosis in a heterozygous apolipoprotein E-deficient murine model. Circulation. 2002;105(7):861-7.
13. Potempa J, Pike R, Travis J. Titration and mapping of the active site of cysteine proteinases from Porphyromonas gingivalis (Gingipains) using peptidyl chloromethanes. Biological Chemistry. 1997;378(3-4):223-30.
14. Kumagai Y, Konishi K, Gomi T, Yagishita H, Yajima A, Yoshikawa M. Enzymatic properties of dipeptidyl aminopeptidase IV produced by the periodontal pathogen Porphyromonas gingivalis and its participation in virulence. Infection and Immunity. 2000;68(2):716-24.
15. Abiko Y, Hayakawa M, Murai S, Takiguchi H. Glycylprolyl dipeptidylaminopeptidase from Bacteroides gingivalis. Journal of Dental Research. 1985;64(2):106-11.
16. Banbula A, Bugno M, Goldstein J, Yen J, Nelson D, Travis J, et al. Emerging family of proline-specific peptidases of Porphyromonas gingivalis: Purification and characterization of serine dipeptidyl peptidase, a structural and functional homologue of mammalian prolyl dipeptidyl peptidase IV.
Infection and Immunity. 2000;68(3):1176-82.
17. Oda H, Saiki K, Tonosaki M, Yajima A, Konishi K. Participation of the secreted dipeptidyl and tripeptidyl aminopeptidases in asaccharolytic growth of Porphyromonas gingivalis. Journal of Periodontal Research. 2009;44(3):362-7.
18. Rawlings ND, Polgar L, Barrett AJ. A new family of serine-type peptidases related to prolyl oligopeptidase. Biochemical Journal. 1991;279:907-8.
19. Kotch FW, Guzei IA, Raines RT. Stabilization of the collagen triple helix by O-methylation of hydroxyproline residues. Journal of the American Chemical Society. 2008;130(10):2952-3.
20. Kumagai Y, Yagishita H, Yajima A, Okamoto T, Konishi K. Molecular mechanism for connective tissue destruction by dipeptidyl aminopeptidase IV produced by the periodontal pathogen Porphyromonas gingivalis. Infection and Immunity. 2005;73(5):2655-64.
21. Wu Y-m, Yan J, Chen L-l, Gu Z-y. Association between infection of different strains of Porphyromonas gingivalis and Actinobacillus actinomycetemcomitans in subgingival plaque and clinical parameters in chronic periodontitis. Journal of Zhejiang University-Science B. 2007;8(2):121-31.
22. Lin XH, Wu J, Xie H. Porphyromonas gingivalis minor fimbriae are required for cell-cell interactions. Infection and Immunity. 2006;74(10):6011-5.
23. Irshad M, van der Reijden WA, Crielaard W, Laine ML. In Vitro Invasion and Survival of Porphyromonas gingivalis in Gingival Fibroblasts; Role of the Capsule. Archivum Immunologiae Et Therapiae Experimentalis. 2012;60(6):469-76.
24. Persson GR. Immune responses and vaccination against periodontal infections. Journal of Clinical Periodontology. 2005;32:39-53.
25. Teshirogi K, Hayakawa M, Ikemi T, Abiko Y. Production of monoclonal antibody inhibiting dipeptidylaminopeptidase IV activity of Porphyromonas gingivalis. Hybridoma and Hybridomics. 2003;22(3):147-51.
26. Gilmore BF, Carson L, McShane LL, Quinn D, Coulter WA, Walker B.
Synthesis, kinetic evaluation, and utilization of a biotinylated dipeptide proline diphenyl phosphonate for the disclosure of dipeptidyl peptidase IV-like serine proteases. Biochemical and Biophysical Research Communications. 2006;347(1):373-9.
27. Bodet C, Piche M, Chandad F, Grenier D. Inhibition of periodontopathogen-derived proteolytic enzymes by a high-molecular-weight fraction isolated from cranberry. Journal of Antimicrobial Chemotherapy. 2006;57(4):685-90.
28. Weber AE, Kim D, Beconi M, Eiermann G, Fisher M, He HB, et al. MK-0431 is a potent, selective, dipeptidyl peptidase IV inhibitor for the treatment of type 2 diabetes. Diabetes. 2004;53:A151-A.
29. Feng J, Zhang ZY, Wallace MB, Stafford JA, Kaldor SW, Kassel DB, et al. Discovery of alogliptin: A potent, selective, bioavailable, and efficacious inhibitor of dipeptidyl peptidase IV. Journal of Medicinal Chemistry. 2007;50(10):2297-300.
30. Ghate M, Jain SV. Structure Based Lead Optimization Approach in Discovery of Selective DPP4 Inhibitors. Mini-Reviews in Medicinal Chemistry. 2013;13(6):888-914.
31. Rea D, Lambeir AM, Kumagai Y, De Meester I, Scharpe S, Fulop V. Expression, purification and preliminary crystallographic analysis of dipeptidyl peptidase IV from Porphyromonas gingivalis. Acta Crystallographica Section D-Biological Crystallography. 2004;60:1871-3.
32. Tiruppathi C, Miyamoto Y, Ganapathy V, Leibach FH. Genetic evidence for role of DPP-IV in intestinal hydrolysis and assimilation of prolyl peptides. American Journal of Physiology. 1993;265(1):G81-G9.
33. Drucker DJ. Minireview: The glucagon-like peptides. Endocrinology. 2001;142(2):521-7.
34. Ansorge S, Buhling F, Kahne T, Lendeckel U, Reinhold D, Tager M, et al. CD26 dipeptidyl peptidase IV in lymphocyte growth regulation. Cellular Peptidases in Immune Functions and Diseases. 1997;421:127-40.
35. Ludwig A, Schiemann F, Mentlein R, Lindner B, Brandt E.
Dipeptidyl peptidase IV (CD26) on T cells cleaves the CXC chemokine CXCL11 (I-TAC) and abolishes the stimulating but not the desensitizing potential of the chemokine. Journal of Leukocyte Biology. 2002;72(1):183-91.
36. Busek P, Malik R, Sedo A. Dipeptidyl peptidase IV activity and/or structure homologues (DASH) and their substrates in cancer. International Journal of Biochemistry & Cell Biology. 2004;36(3):408-21.
37. Ludwig K, Fan H, Dobers J, Berger M, Reutter W, Bottcher C. 3D structure of the CD26-ADA complex obtained by cryo-EM and single particle analysis. Biochemical and Biophysical Research Communications. 2004;313(2):223-9.
38. Cheng HC, Abdel-Ghany M, Pauli BU. A novel consensus motif in fibronectin mediates dipeptidyl peptidase IV adhesion and metastasis. Journal of Biological Chemistry. 2003;278(27):24600-7.
39. Loster K, Zeilinger K, Schuppan D, Reutter W. The cysteine-rich region of dipeptidyl peptidase IV (CD 26) is the collagen-binding site. Biochemical and Biophysical Research Communications. 1995;217(1):341-8.
40. Blanco J, Valenzuela A, Herrera C, Lluis C, Hovanessian AG, Franco R. The HIV-1 gp120 inhibits the binding of adenosine deaminase to CD26 by a mechanism modulated by CD4 and CXCR4 expression. FEBS Letters. 2000;477(1-2):123-8.
41. Herrera C, Morimoto C, Blanco J, Mallol J, Arenzana F, Lluis C, et al. Comodulation of CXCR4 and CD26 in human lymphocytes. Journal of Biological Chemistry. 2001;276(22):19532-9.
42. Ishii T, Ohnuma K, Murakami A, Takasawa N, Kobayashi S, Dang NH, et al. CD26-mediated signaling for T cell activation occurs in lipid rafts through its association with CD45RO. Proceedings of the National Academy of Sciences of the United States of America. 2001;98(21):12138-43.
43. Holst JJ, Deacon CF. Inhibition of the activity of dipeptidyl peptidase IV as a treatment for type 2 diabetes. Diabetes. 1998;47(11):1663-70.
44. Medeiros MD, Turner AJ. Metabolism and functions of neuropeptide Y. Neurochemical Research.
1996;21(9):1125-32.
45. Kajiyama H, Kikkawa F, Suzuki T, Shibata K, Ino K, Mizutani S. Prolonged survival and decreased invasive activity attributable to dipeptidyl peptidase IV overexpression in ovarian carcinoma. Cancer Research. 2002;62(10):2753-7.
46. Proost P, De Meester I, Schols D, Struyf S, Lambeir AM, Wuyts A, et al. Amino-terminal truncation of chemokines by CD26/dipeptidylpeptidase IV - Conversion of RANTES into a potent inhibitor of monocyte chemotaxis and HIV-1-infection. Journal of Biological Chemistry. 1998;273(13):7222-7.
47. Thoma R, Loffler B, Stihle M, Huber W, Ruf A, Hennig M. Structural basis of proline-specific exopeptidase activity as observed in human dipeptidyl peptidase-IV. Structure. 2003;11(8):947-59.
48. Rea D, Fulop V. Structure-function properties of prolyl oligopeptidase family enzymes. Cell Biochemistry and Biophysics. 2006;44(3):349-65.
49. Chien CH, Huang LH, Chou CY, Chen YS, Han YS, Chang GG, et al. One site mutation disrupts dimer formation in human DPP-IV proteins. Journal of Biological Chemistry. 2004;279(50):52338-45.
50. Suzuki Y, Erickson RH, Sedlmayer A, Chang SK, Ikehara Y, Kim YS. Dietary regulation of rat intestinal angiotensin-converting enzyme and dipeptidyl peptidase IV. American Journal of Physiology. 1993;264(6):G1153-G9.
51. Chung KM, Cheng JH, Suen CS, Huang CH, Tsai CH, Huang LH, et al. The dimeric transmembrane domain of prolyl dipeptidase DPP-IV contributes to its quaternary structure and enzymatic activities. Protein Science. 2010;19(9):1627-38.
52. Chien CH, Tsai CH, Lin CH, Chou CY, Chen X. Identification of hydrophobic residues critical for DPP-IV dimerization. Biochemistry. 2006;45(23):7006-12.
53. Rasmussen HB, Branner S, Wiberg FC, Wagtmann N. Crystal structure of human dipeptidyl peptidase IV/CD26 in complex with a substrate analog. Nature Structural Biology. 2003;10(1):19-25.
54. Simpkins LM, Bolton S, Pi Z, Sutton JC, Kwon C, Zhao G, et al.
Potent non-nitrile dipeptidic dipeptidyl peptidase IV inhibitors. Bioorganic & Medicinal Chemistry Letters. 2007;17(23):6476-80.
55. Oefner C, D'Arcy A, Mac Sweeney A, Pierau S, Gardiner R, Dale GE. High-resolution structure of human apo dipeptidyl peptidase IV/CD26 and its complex with 1-[({2-[(5-iodopyridin-2-yl)amino]-ethyl}amino)-acetyl]-2-cyano-(S)-pyrrolidine. Acta Crystallographica Section D-Biological Crystallography. 2003;59:1206-12.
56. Kim YB, Kopcho LM, Kirby MS, Hamann LG, Weigelt CA, Metzler WJ, et al. Mechanism of Gly-Pro-pNA cleavage catalyzed by dipeptidyl peptidase-IV and its inhibition by saxagliptin (BMS-477118). Archives of Biochemistry and Biophysics. 2006;445(1):9-18.
57. Engel M, Hoffmann T, Wagner L, Wermann M, Heiser U, Kiefersauer R, et al. The crystal structure of dipeptidyl peptidase IV (CD26) reveals its functional regulation and enzymatic mechanism. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(9):5063-8.
58. Nakajima Y, Ito K, Toshima T, Egawa T, Zheng H, Oyama H, et al. Dipeptidyl Aminopeptidase IV from Stenotrophomonas maltophilia Exhibits Activity against a Substrate Containing a 4-Hydroxyproline Residue. Journal of Bacteriology. 2008;190(23):7819-29.
59. Banbula A, Mak P, Bugno M, Silberring J, Dubin A, Nelson D, et al. Prolyl tripeptidyl peptidase from Porphyromonas gingivalis - A novel enzyme with possible pathological implications for the development of periodontitis. Journal of Biological Chemistry. 1999;274(14):9246-52.
60. Ito K, Nakajima Y, Xu Y, Yamada N, Onohara Y, Ito T, et al. Crystal structure and mechanism of tripeptidyl activity of prolyl tripeptidyl aminopeptidase from Porphyromonas gingivalis. Journal of Molecular Biology. 2006;362(2):228-40.
61. Terwilliger TC, Berendzen J. Automated MAD and MIR structure solution. Acta Crystallographica Section D-Biological Crystallography. 1999;55:849-61.
62. Terwilliger TC, Kim SH, Eisenberg D.
Generalized method of determining heavy-atom positions using the difference Patterson function. Acta Crystallographica Section A. 1987;43:1-5.
63. Terwilliger TC, Eisenberg D. Unbiased 3-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Crystallographica Section A. 1983;39(SEP):813-7.
64. Terwilliger TC, Eisenberg D. Isomorphous replacement - effects of errors on the phase probability distribution. Acta Crystallographica Section A. 1987;43:6-13.
65. Terwilliger TC. MAD phasing - treatment of dispersive differences as isomorphous replacement information. Acta Crystallographica Section D-Biological Crystallography. 1994;50:17-23.
66. Terwilliger TC. MAD phasing - Bayesian estimates of F-A. Acta Crystallographica Section D-Biological Crystallography. 1994;50:11-6.
67. Terwilliger TC, Berendzen J. Bayesian correlated MAD phasing. Acta Crystallographica Section D-Biological Crystallography. 1997;53:571-9.
68. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallographica Section D-Biological Crystallography. 2004;60(Sp. 1):2126-32.
69. Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallographica Section D. 2010;66(4):486-501.
70. Collaborative Computational Project, Number 4. The CCP4 suite: programs for protein crystallography. Acta Cryst. 1994;D50:760-3.
71. Potterton E, Briggs P, Turkenburg M, Dodson E. A graphical user interface to the CCP4 program suite. Acta Cryst. 2003;D59:1131-7.
72. Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, et al. REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. 2011;D67:355-67.
73. Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallographica Section D-Biological Crystallography. 1997;53:240-55.
74. Lovell SC, Davis IW, Arendall III WB, de Bakker PIW, Word JM, Prisant MG, et al.
Structure validation by Cα geometry: φ/ψ and Cβ deviation. Proteins: Structure, Function & Genetics; 2002. p. 437-50.
75. Krissinel E, Henrick K. Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Cryst. 2004;D60:2256-68.
76. Vaguine AA, Richelle J, Wodak SJ. SFCHECK: a unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Crystallographica Section D-Biological Crystallography. 1999;55:191-205.
77. Winn MD, Murshudov GN, Papiz MZ. Macromolecular TLS refinement in REFMAC at moderate resolutions. Methods in Enzymology. 2003;374:300-21.
78. Cockcroft KC. A Hypertext Book of Crystallographic Space Group Diagrams and Tables: Birkbeck College; 1999 [cited 2013 7th June]. Available from: http://img.chem.ucl.ac.uk/sgp/large/004ay1.htm.
79. Sonnhammer EL, von Heijne G, Krogh A. A hidden Markov model for predicting transmembrane helices in protein sequences. Proceedings / International Conference on Intelligent Systems for Molecular Biology ; ISMB International Conference on Intelligent Systems for Molecular Biology. 1998;6:175-82.
80. Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology. 2001;305(3):567-80.
81. Heras B, Martin JL. Post-crystallization treatments for improving diffraction quality of protein crystals. Acta Crystallographica Section D-Biological Crystallography. 2005;61:1173-80.
47 7 - Appendix 7.1 - Protein Sequence materials 7.1.1 - PgDPPIV sequence MKRPVIILLLGIVTMCAMAQTGDKPVDLKEITSGMFYARSAGRGIRSMPDGEHYTEMNRERTAIVRYNYASGKAVDTLFSIERARE CPFKQIQNYEVSSTGHHILLFTDMESIYRHSYRAAVYDYDVRRNLVKPLSEHVGKVMIPTFSPDGRMVAFVRDNNIFIKKFDFDTE VQVTTDGQINSVLNGATDWVYEEEFGVTNLMSWSADNAFLAFVRSDESAVPEYRMPMYEDKLYPEDYTYKYPKAGEKNSTVSLHLY NVADRNTKSVSLPIDADGYIPRIAFTDNADELAVMTLNRLQNDFKMYYVHPKSLVPKLILQDMNKRYVDSDWIQALKFTAGGGFAY VSEKDGFAHIYLYDNKGVMHRRITSGNWDVTKLYGVDASGTVFYQSAEESPIRRAVYAIDAKGRKTKLSLNVGTNDALFSGNYAYY INTYSSAATPTVVSVFRSKGAKELRTLEDNVALRERLKAYRYNPKEFTIIKTQSALELNAWIVKPIDFDPSRHYPVLMVQYSGPNS QQVLDRYSFDWEHYLASKGYVVACVDGRGTGARGEEWRKCTYMQLGVFESDDQIAAATAIGQLPYVDAARIGIWGWSYGGYTTLMS LCRGNGTFKAGIAVAPVADWRFYDSVYTERFMRTPKENASGYKMSSALDVASQLQGNLLIVSGSADDNVHLQNTMLFTEALVQANI PFDMAIYMDKNHSIYGGNTRYHLYTRKAKFLFDNL 7.1.2 - HsDPPIV sequence MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSLRWISDHEYLYKQENNILVFNAEYGNS SVFLENSTFDEFGHSINDYSISPDGQFILLEYNYVKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDI YVKIEPNLPSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSFYSDESLQYPKTVRVPYPK AGAVNPTVKFFVVNTDSLSSVTNATSIQITAPASMLIGDHYLCDVTWATQERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQ HIEMSTTGWVGRFRPSEPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIEALTSDYLYYISNEYKGMPGGRN LYKIQLSDYTKVTCLSCELNPERCQYYSVSFSKEAKYYQLRCSGPGLPLYTLHSSVNDKGLRVLEDNSALDKMLQNVQMPSKKLDF IILNETKFWYQMILPPHFDKSKKYPLLLDVYAGPCSQKADTVFRLNWATYLASTENIIVASFDGRGSGYQGDKIMHAINRRLGTFE VEDQIEAARQFSKMGFVDNKRIAIWGWSYGGYVTSMVLGSGSGVFKCGIAVAPVSRWEYYDSVYTERYMGLPTPEDNLDHYRNSTV MSRAENFKQVEYLLIHGTADDNVHFQQSAQISKALVDVGVDFQAMWYTDEDHGIASSTAHQHIYTHMSHFIKQCFSLP 7.1.3 - SsDPPIV sequence MKTPWKVLLGLLGIAALVTVITVPVVLLNKGTDDAAADSRRTYTLTDYLKSTFRVKFYTLQWISDHEYLYKQENNILLFNAEYGNS SIFLENSTFDELGYSTNDYSVSPDRQFILFEYNYVKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWITWSPVGHKLAYVWNNDI YVKNEPNLSSQRITWTGKENVIYNGVTDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSFYSDESLQYPKTVRIPYPK AGAENPTVKFFVVDTRTLSPNASVTSYQIVPPASVLIGDHYLCGVTWVTEERISLQWIRRAQNYSIIDICDYDESTGRWISSVARQ 
HIEISTTGWVGRFRPAEPHFTSDGNSFYKIISNEEGYKHICHFQTDKSNCTFITKGAWEVIGIEALTSDYLYYISNEHKGMPGGRN
LYRIQLNDYTKVTCLSCELNPERCQYYSASFSNKAKYYQLRCFGPGLPLYTLHSSSSDKELRVLEDNSALDKMLQDVQMPSKKLDV
INLHGTKFWYQMILPPHFDKSKKYPLLIEVYAGPCSQKVDTVFRLSWATYLASTENIIVASFDGRGSGYQGDKIMHAINRRLGTFE
VEDQIEATRQFSKMGFVDDKRIAIWGWSYGGYVTSMVLGAGSGVFKCGIAVAPVSKWEYYDSVYTERYMGLPTPEDNLDYYRNSTV
MSRAENFKQVEYLLIHGTADDNVHFQQSAQLSKALVDAGVDFQTMWYTDEDHGIASNMAHQHIYTHMSHFLKQCFSLP

7.1.4 - SmDPPIV sequence
MRHLFASLAFMLATSTVAHAEKLTLEAITGPLPLSGPTLMKPKVAPDGSRVTFLRGKDSDRNQLDLWSYDIGSGQTRLLVDSKVVL
PGTETLSDEEKARRERQRIAAMTGIVDYQWSPDAQRLLFPLGGELYLYDLKQEGKAAVRQLTHGEGFATDAKLSPKGGFVSFIRGR
NLWVIDLASGRQMQLTADGSTTIGNGIAEFVADEEMDRHTGYWWAPDDSAIAYARIDESPVPVQKRYEVYADRTDVIEQRYPAAGD
ANVQVKLGVISPAEQAQTQWIDLGKEQDIYLARVNWRDPQHLSFQRQSRDQKKLDLVEVTLASNQQRVLAHETSPTWVPLHNSLRF
LDDGSILWSSERTGFQHLYRIDSKGKAAALTHGNWSVDELLAVDEKAGLAYFRAGIESARESQIYAVPLQGGQPQRLSKAPGMHSA
SFARNASVYVDSWSNNSTPPQIELFRANGEKIATLVENDLADPKHPYARYREAQRPVEFGTLTAADGKTPLNYSVIKPAGFDPAKR
YPVAVYVYGGPASQTVTDSWPGRGDHLFNQYLAQQGYVVFSLDNRGTPRRGRDFGGALYGKQGTVEVADQLRGVAWLKQQPWVDPA
RIGVQGWSNGGYMTLMLLAKASDSYACGVAGAPVTDWGLYDSHYTERYMDLPARNDAGYREARVLTHIEGLRSPLLLIHGMADDNV
LFTNSTSLMSALQKRGQPFELMTYPGAKHGLSGADALHRYRVAEAFLGRCLKP

7.1.5 - PgPTP sequence
MKKTIFQQLFLSVCALTVALPCSAQSPETSGKEFTLEQLMPGGKEFYNFYPEYVVGLQWMGDNYVFIEGDDLVFNKANGKSAQTTR
FSAADLNALMPEGCKFQTTDAFPSFRTLDAGRGLVVLFTQGGLVGFDMLARKVTYLFDTNEETASLDFSPVGDRVAYVRNHNLYIA
RGGKLGEGMSRAIAVTIDGTETLVYGQAVHQREFGIEKGTFWSPKGSCLAFYRMDQSMVKPTPIVDYHPLEAESKPLYYPMAGTPS
HHVTVGIYHLATGKTVYLQTGEPKEKFLTNLSWSPDENILYVAEVNRAQNECKVNAYDAETGRFVRTLFVETDKHYVEPLHPLTFL
PGSNNQFIWQSRRDGWNHLYLYDTTGRLIRQVTKGEWEVTNFAGFDPKGTRLYFESTEASPLERHFYCIDIKGGKTKDLTPESGMH
RTQLSPDGSAIIDIFQSPTVPRKVTVTNIGKGSHTLLEAKNPDTGYAMPEIRTGTIMAADGQTPLYYKLTMPLHFDPAKKYPVIVY
VYGGPHAQLVTKTWRSSVGGWDIYMAQKGYAVFTVDSRGSANRGAAFEQVIHRRLGQTEMADQMCGVDFLKSQSWVDADRIGVHGW
SYGGFMTTNLMLTHGDVFKVGVAGGPVIDWNRYEIMYGERYFDAPQENPEGYDAANLLKRAGDLKGRLMLIHGAIDPVVVWQHSLL
FLDACVKARTYPDYYVYPSHEHNVMGPDRVHLYETITRYFTDHL

7.1.6 - CLUSTAL W2 multiple sequence alignment
Key residues of the active site are colour annotated for easy comparison of alignment. The catalytic triad is annotated in red, the oxyanion stabilising residues are annotated in blue, hydrophobic binding residues are annotated in yellow, and amino-terminus binding residues are annotated in cyan.

PgDPPIV   -----------MKRPVIILLLGIVTMCAMAQTGDKPVDLKEITSGMFYARSAGRG-IRSM 48
HsDPPIV   MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSL 60
SsDPPIV   MKTPWKVLLGLLGIAALVTVITVPVVLLNKGTDDAAADSRRTYTLTDYLKSTFRVKFYTL 60
SmDPPIV   ---MRHLFASLAFMLATSTVAHAEKLTLEAITGPLPLSGPTLMKPKVAPDGSRVTFLRGK 57
PgPTP     ---MKKTIFQQLFLSVCALTVALPCSAQSPETSGKEFTLEQLMPGGKEFYNFYPEYVVGL 57

PgDPPIV   PDGEHYTEMNRERTAIVRYNYASGKAVDTLFSVERARECPFKQIQNYEVSSTGHHILLFT 108
HsDPPIV   RWISDHEYLYKQENNILVFNAEYGNSS--VFLENSTFDEFGHSINDYSISPDGQFILLEY 118
SsDPPIV   QWISDHEYLYKQENNILLFNAEYGNSS--IFLENSTFDELGYSTNDYSVSPDRQFILFEY 118
SmDPPIV   DSDRNQLDLWSYDIGSGQTRLLVDSKVVLPGTETLSDEEKARRERQRIAAMTGIVDYQWS 117
PgPTP     QWMGDNYVFIEGDD--LVFNKANGKSAQTTRFSAADLNALMPEGCKFQTTDAFPSFRTLD 115

PgDPPIV   DMESIYRHSYRAAVYDYDVR---RNLVKPLSEHVGKVMIPTFSPDGRMVAFVRDNNIFIK 165
HsDPPIV   NYVKQWRHSYTASYDIYDLN---KRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDIYVK 175
SsDPPIV   NYVKQWRHSYTASYDIYDLN---KRQLITEERIPNNTQWITWSPVGHKLAYVWNNDIYVK 175
SmDPPIV   PDAQRLLFPLGGELYLYDLKQEGKAAVRQLTHGEGFATDAKLSPKGGFVSFIRGRNLWVI 177
PgPTP     AGRGLVVLFTQGGLVGFDML---ARKVTYLFDTNEETASLDFSPVGDRVAYVRNHNLYIA 172

PgDPPIV   K-----FDFDTEVQVTTDGQINSILNGATDWVYEEEFG-VTNLMSWSADNAFLAFVRSDE 219
HsDPPIV   I-----EPNLPSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFND 230
SsDPPIV   N-----EPNLSSQRITWTGKENVIYNGVTDWVYEEEVFSAYSALWWSPNGTFLAYAQFND 230
SmDPPIV   D-----LASGRQMQLTADG-STTIGNGIAEFVADEEMD-RHTGYWWAPDDSAIAYARIDE 230
PgPTP     RGGKLGEGMSRAIAVTIDGTETLVYG---QAVHQREFG-IEKGTFWSPKGSCLAFYRMDQ 228

PgDPPIV   SAVPEYR-MPMYEDK--IYPEDYTYKYPKAGEKNSTVSLHLYNVADRNTKSVSLPIDADG 276
HsDPPIV   TEVPLIE-YSFYSDESLQYPKTVRVPYPKAGAVNPTVKFFVVNTDSLSSVTNATSIQITA 289
SsDPPIV   TEVPLIE-YSFYSDESLQYPKTVRIPYPKAGAENPTVKFFVVDTRTLSPNASVTSYQIVP 289
SmDPPIV   SPVPVQKRYEVYADR----TDVIEQRYPAAGDANVQVKLGVISPAEQAQTQWIDLGKEQD 286
PgPTP     SMVKPTPIVDYHPLE----AESKPLYYPMAGTPSHHVTVGIYHLATGKTVYLQTGEPKEK 284

PgDPPIV   ---------YIPRIAFTDNADELAVMTLNRLQN-------DFKMYYVHPKSLVAKLILQD 320
HsDPPIV   PASMLIGDHYLCDVTWAT-QERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQHIEM 348
SsDPPIV   PASVLIGDHYLCGVTWVT-EERISLQWIRRAQNYSIIDICDYDESTGRWISSVARQHIEI 348
SmDPPIV   --------IYLARVNWRD-PQHLSFQRQSRDQK-------KLDLVEVTLASNQQRVLAHE 330
PgPTP     ---------FLTNLSWSPDENILYVAEVNRAQN-----ECKVNAYDAETGRFVRTLFVET 330

PgDPPIV   MNKRYVDSDWIQALKFTAGGG--FAYVSEKDGFAHIYLYDNKGVMHRRITSGNWDVTKLY 378
HsDPPIV   STTGWVGRFRPSEPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIE 408
SsDPPIV   STTGWVGRFRPAEPHFTSDGNSFYKIISNEEGYKHICHFQTDKSNCTFITKGAWEVIGIE 408
SmDPPIV   TSPTWVPLHN--SLRFLDDGS--ILWSSERTGFQHLYRIDSKGK-AAALTHGNWSVDELL 385
PgPTP     DKHYVEPLHP---LTFLPGSNNQFIWQSRRDGWNHLYLYDTTGRLIRQVTKGEWEVTNFA 387

PgDPPIV   GVDASGTVFYQSAEE-SPIRRAVYAIDAKG-RKTKLSLNVGTN-------DALFSGNYAY 429
HsDPPIV   ALTSDYLYYISNEYKGMPGGRNLYKIQLSD-YTKVTCLSCELNPERCQYYSVSFSKEAKY 467
SsDPPIV   ALTSDYLYYISNEHKGMPGGRNLYRIQLND-YTKVTCLSCELNPERCQYYSASFSNKAKY 467
SmDPPIV   AVDEKAGLAYFRAGIESARESQIYAVPLQGGQPQRLSKAPGMH-------SASFARNASV 438
PgPTP     GFDPKGTRLYFESTEASPLERHFYCIDIKGGKTKDLTPESGMH-------RTQLSPDGSA 440

PgDPPIV   YINTYSSAATPTVVSVFRSKDAKELRTLE----DNVALRERLKAYRYNPKEFTIIKTQSG 485
HsDPPIV   YQLRCSGPGLP-LYTLHSSVNDKGLRVLE----DNSALDKMLQNVQMPSKKLDFIILN-E 521
SsDPPIV   YQLRCFGPGLP-LYTLHSSSSDKELRVLE----DNSALDKMLQDVQMPSKKLDVINLH-G 521
SmDPPIV   YVDSWSNNSTPPQIELFRANGEKIATLVENDLADPKHPYARYREAQRPVEFGTLTAADGK 498
PgPTP     IIDIFQSPTVPRKVTVTN-IGKGSHTLLE-------AKNPDTGYAMPEIRTGTIMAADGQ 492

PgDPPIV   LELNAWIVKPIDFDPSRHYPVLMVQYSGPNSQQVLDRYS----FDWEHYLASKG-YVVAC 540
HsDPPIV   TKFWYQMILPPHFDKSKKYPLLLDVYAGPCSQKADTVFR----LNWATYLASTENIIVAS 577
SsDPPIV   TKFWYQMILPPHFDKSKKYPLLIEVYAGPCSQKVDTVFR----LSWATYLASTENIIVAS 577
SmDPPIV   TPLNYSVIKPAGFDPAKRYPVAVYVYGGPASQTVTDSWPGRGDHLFNQYLAQQG-YVVFS 557
PgPTP     TPLYYKLTMPLHFDPAKKYPVIVYVYGGPHAQLVTKTWR-SSVGGWDIYMAQKG-YAVFT 550

PgDPPIV   VDGRGTGARGEEWRKCTYMQLGVFESDDQIAAATAIGQLPYVDAARIGIWGWSYGGYTTL 600
HsDPPIV   FDGRGSGYQGDKIMHAINRRLGTFEVEDQIEAARQFSKMGFVDNKRIAIWGWSYGGYVTS 637
SsDPPIV   FDGRGSGYQGDKIMHAINRRLGTFEVEDQIEATRQFSKMGFVDDKRIAIWGWSYGGYVTS 637
SmDPPIV   LDNRGTPRRGRDFGGALYGKQGTVEVADQLRGVAWLKQQPWVDPARIGVQGWSNGGYMTL 617
PgPTP     VDSRGSANRGAAFEQVIHRRLGQTEMADQMCGVDFLKSQSWVDADRIGVHGWSYGGFMTT 610

PgDPPIV   MSLCRGNGTFKAGIAVAPVADWRFYDSVYTERFMRTPK--ENASGYKMSSALDVAS-QLQ 657
HsDPPIV   MVLGSGSGVFKCGIAVAPVSRWEYYDSVYTERYMGLPTPEDNLDHYRNSTVMSRAENFKQ 697
SsDPPIV   MVLGAGSGVFKCGIAVAPVSKWEYYDSVYTERYMGLPTPEDNLDYYRNSTVMSRAENFKQ 697
SmDPPIV   MLLAKASDSYACGVAGAPVTDWGLYDSHYTERYMDLPAR--NDAGYREARVLTHIE-GLR 674
PgPTP     NLMLTHGDVFKVGVAGGPVIDWNRYEIMYGERYFDAPQE--NPEGYDAANLLKRAG-DLK 667

PgDPPIV   GNLLIVSGSADDNVHLQNTMLFTEALVQANIPFDMAIYMDKNHSIYGGNTRYHLYIRKAK 717
HsDPPIV   VEYLLIHGTADDNVHFQQSAQISKALVDVGVDFQAMWYTDEDHGIASSTAHQHIYTHMSH 757
SsDPPIV   VEYLLIHGTADDNVHFQQSAQLSKALVDAGVDFQTMWYTDEDHGIASNMAHQHIYTHMSH 757
SmDPPIV   SPLLLIHGMADDNVLFTNSTSLMSALQKRGQPFELMTYPGAKHGLSG-ADALHRYRVAEA 733
PgPTP     GRLMLIHGAIDPVVVWQHSLLFLDACVKARTYPDYYVYPSHEHNVMG-PDRVHLYETITR 726

PgDPPIV   FLFDNL--- 723
HsDPPIV   FIKQCFSLP 766
SsDPPIV   FLKQCFSLP 766
SmDPPIV   FLGRCLKP- 741
PgPTP     YFTDHL--- 732

7.1.7 - ClustalW2 Sequence alignment Scores

Sequence 1   Sequence 2   Score (%)
PgDPPIV      PgPTP        25.86
PgDPPIV      HsDPPIV      25.45
PgDPPIV      SsDPPIV      25.45
PgDPPIV      SmDPPIV      27.66
PgPTP        HsDPPIV      18.31
PgPTP        SsDPPIV      19.67
PgPTP        SmDPPIV      25.82
HsDPPIV      SsDPPIV      88.38
HsDPPIV      SmDPPIV      21.59
SsDPPIV      SmDPPIV      21.32

Word count excluding abstract, table of contents, figure captions, tables, bibliography and appendix materials: 9,561